We subjected the Vivo X Fold to our rigorous SBMARK audio test suite to measure its performance both when recording sound using its built-in microphones, and when playing audio through its speakers.
In this review, we’ll analyze how it performed in a variety of tests and several common use cases.
Overview
Key audio specs include:
- Two speakers (top side, bottom side)
- No audio jack output
Playback
Pros
- Good stereo wideness, though it could be better given the large size of the device when open
- Pleasant low-end extension
- Excellent attack rendition and fairly good punch
Cons
- Inconsistent tonal balance
- Strong compression and pumping, along with bass distortion at maximum volume
- Not quite intelligible at low volume
Recording
Pros
- Excellent immersive spatial performance, with extremely impressive stereo wideness and perfect localizability
- Good tonal balance with natural rendering of voices
Cons
- SNR could be better in all apps used for testing
- Quite sensitive to microphone occlusions
With an SBMARK Audio score of 132, the Vivo X Fold performed well overall in our tests. While it proved to be an exceptional device for recording, thanks to the wideness of its stereo recordings, playback results were just above average, leaving testers wanting more, considering the X Fold’s size and price.
In audio playback, our experts were pleased with a deep bass delivery that complements an enjoyable experience, whether listening to music, watching movies, or playing games. However, the tonal balance was inconsistent depending on content and volume, and while stereo wideness was perfectly good, one could hope for something even better.
The X Fold really shone when recording, and while it was at its best with the main camera, results were also good with the front camera and the memo app. The audio recordings were exceptionally engaging, thanks to outstanding stereo wideness in both horizontal and vertical orientations when opened, and offered a very pleasant and natural sound signature. The Vivo is also equipped with an audio zoom function that is useful in certain situations. However, our experts found that it could benefit from some tweaking.
Test summary
Learn about SBMARK audio tests: For scoring and analysis in our smartphone audio reviews, SBMARK engineers perform a series of objective tests and undertake more than 20 hours of perceptual evaluation under controlled laboratory conditions.
The following section compiles the key elements of our extensive testing and analysis performed in the SBMARK laboratories. Detailed performance evaluations in the form of reports are available upon request. Do not hesitate to contact us.
How the audio playback score is composed
SBMARK engineers test playback through the smartphone speakers, whose performance is evaluated in our labs and under real-life conditions, using default apps and settings.
The playback performance of the Vivo X Fold was fairly average overall. The timbre was pleasant and benefited from a very nice low end, but the tonal balance could have been smoother, with the upper bass/lower mid region being rather weak compared to the low-end extension and the clarity of the upper midrange. The midrange was decent, as was the treble, despite a notable lack of top-end extension.
Dynamic performance was quite good, with a sharp and accurate attack most of the time, very decent bass accuracy, and quite a powerful punch.
The wideness of the soundstage created by the internal speakers was very good, but given the device’s large size in its unfolded state, our experts expected even better results, especially in portrait orientation. Individual sound sources could be located quite precisely, and the Vivo offered good depth rendering and realistic distance perception.
Listening at the lowest volume level could be difficult, as the minimum volume felt a bit too low. The maximum volume was quite loud, but it came with some unwanted audio artifacts, including quite strong bass distortion, compression, and pumping. It’s also worth keeping in mind that both speakers can easily be covered by accident while gaming, so hand placement is critical.
Listen to the playback performance of the tested smartphone in this comparison with some of its competitors:
Recordings of the smartphones playing some of our music tracks at an LAeq of 60 dB in an anechoic environment, captured by 2 microphones in A-B configuration at 30 cm
Here’s how the Vivo X Fold fares in playback use cases compared to its competitors:
Playback use-case scores
The Timbre score represents how well a phone reproduces sound across the audible tonal range and takes into account bass, mids, treble, tonal balance, and volume dependency. It is the most important attribute for playback.
Music playback frequency response
A 1/12-octave frequency response graph, which measures the loudness of each frequency emitted by your smartphone as it reproduces a pure sine wave in an anechoic environment.
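To make the 1/12-octave resolution concrete, the sketch below lists the band center frequencies such a graph would span. It assumes base-2 band spacing around a 1 kHz reference and a 20 Hz to 20 kHz range; none of these choices is stated in the review, and the function name is purely illustrative.

```python
import math


def fractional_octave_centers(f_low=20.0, f_high=20000.0, fraction=12, f_ref=1000.0):
    """Return fractional-octave band centers (Hz) lying within [f_low, f_high].

    ASSUMPTIONS (not from the review): base-2 spacing, 1 kHz reference,
    and a 20 Hz - 20 kHz audible range.
    """
    # Band index k gives a center of f_ref * 2**(k / fraction).
    k_min = math.ceil(fraction * math.log2(f_low / f_ref))
    k_max = math.floor(fraction * math.log2(f_high / f_ref))
    return [f_ref * 2 ** (k / fraction) for k in range(k_min, k_max + 1)]


centers = fractional_octave_centers()
print(f"{len(centers)} bands from {centers[0]:.1f} Hz to {centers[-1]:.1f} Hz")
```

Each band center is a factor of 2^(1/12) (about 5.9%) above the previous one, which is why a 1/12-octave plot shows much finer detail than the more common 1/3-octave view.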
The Dynamics Score measures the accuracy of changes in the energy level of sound sources, such as how accurately a bass note or impact sound of drums is played.
Sub-attributes for spatial tests include pinpointing a specific sound’s location, its positional balance, distance, and wideness.
The volume score represents the overall volume of a smartphone and how smoothly the volume increases and decreases based on user input.
Here are some sound pressure levels (SPL) measured while playing our sample recordings of hip-hop and classical music at maximum volume:
Device | Hip-hop | Classical |
Vivo X Fold | 74.5 dBA | 72.7 dBA |
Xiaomi Mix Fold 2 | 72.2 dBA | 67.9 dBA |
Samsung Galaxy Z Fold4 | 71.1 dBA | 67.3 dBA |
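Because decibels are logarithmic, the gaps between these readings are easier to judge once converted to sound-pressure ratios. The following sketch uses only the values from the table above; the helper names are illustrative and are not part of any SBMARK tooling.

```python
# Maximum-volume SPL readings copied from the table above (dBA).
SPL_DBA = {
    "Vivo X Fold": {"hip-hop": 74.5, "classical": 72.7},
    "Xiaomi Mix Fold 2": {"hip-hop": 72.2, "classical": 67.9},
    "Samsung Galaxy Z Fold4": {"hip-hop": 71.1, "classical": 67.3},
}


def level_difference_db(device_a, device_b, genre):
    """Level of device_a minus level of device_b for one genre, in dB."""
    return SPL_DBA[device_a][genre] - SPL_DBA[device_b][genre]


def pressure_ratio(delta_db):
    """Convert a level difference in dB to a sound-pressure ratio."""
    return 10 ** (delta_db / 20)


for genre in ("hip-hop", "classical"):
    for rival in ("Xiaomi Mix Fold 2", "Samsung Galaxy Z Fold4"):
        delta = level_difference_db("Vivo X Fold", rival, genre)
        print(f"{genre}: Vivo X Fold vs {rival}: {delta:+.1f} dB "
              f"(pressure ratio {pressure_ratio(delta):.2f})")
```

For instance, the 3.4 dB hip-hop lead over the Samsung corresponds to a pressure ratio of about 1.48 on the A-weighted signal.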
The following graph shows the gradual changes in volume from minimum to maximum. We expect these changes to be consistent across the range, so that all volume steps match user expectations:
Music volume consistency
This line graph shows the relative loudness of playback versus the user-selected volume step, measured at several volume steps with correlated pink noise in an anechoic box and recorded on axis at 0.20 m.
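One simple way to quantify the consistency described here is to look at the loudness gained per volume step and how far each gain strays from the average. This is a minimal sketch of that idea, not SBMARK's scoring method, and the readings in it are HYPOTHETICAL placeholders since the review does not publish the underlying data.

```python
from statistics import mean


def step_increments(levels_db):
    """Loudness change (dB) between each pair of consecutive volume steps."""
    return [b - a for a, b in zip(levels_db, levels_db[1:])]


def max_step_deviation(levels_db):
    """Largest departure (dB) of any single step from the average step size."""
    increments = step_increments(levels_db)
    average = mean(increments)
    return max(abs(inc - average) for inc in increments)


# HYPOTHETICAL relative loudness per volume step, in dB (not measured data).
example_levels_db = [-42.0, -36.0, -31.0, -27.0, -24.0, -21.0, -18.0]

print("step sizes:", step_increments(example_levels_db))
print("max deviation from mean step:", max_step_deviation(example_levels_db))
```

A perfectly even volume curve would give a deviation of 0 dB; larger values indicate steps that feel too coarse or too fine relative to the rest of the range.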
The artifact score measures the extent to which the sound is affected by various types of distortion. The higher the score, the less noticeable sound disturbances are. Distortion can occur due to the sound processing in the device and the quality of the speakers.
Playback Total Harmonic Distortion (maximum volume)
This graph shows total harmonic distortion and noise over the audible frequency range.
It represents the distortion and noise of the device playing our test signal (a 0 dBFS sine sweep in an anechoic box at 40 cm) at the device’s maximum volume.
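As background for reading such a graph, the textbook definition of total harmonic distortion at a single test frequency is the combined amplitude of the harmonics relative to the fundamental. The sketch below implements only that definition: it leaves out the noise term that the graph also includes, it is not SBMARK's measurement code, and the amplitudes are HYPOTHETICAL.

```python
import math


def thd_ratio(fundamental, harmonics):
    """THD as a ratio: sqrt(sum of squared harmonic amplitudes) / fundamental."""
    if fundamental <= 0:
        raise ValueError("fundamental amplitude must be positive")
    return math.sqrt(sum(h * h for h in harmonics)) / fundamental


def thd_db(fundamental, harmonics):
    """THD expressed in dB relative to the fundamental."""
    ratio = thd_ratio(fundamental, harmonics)
    return -math.inf if ratio == 0 else 20 * math.log10(ratio)


# HYPOTHETICAL amplitudes: fundamental, then 2nd and 3rd harmonics.
fundamental_amp = 1.0
harmonic_amps = [0.03, 0.04]

print(f"THD = {100 * thd_ratio(fundamental_amp, harmonic_amps):.1f} % "
      f"({thd_db(fundamental_amp, harmonic_amps):.1f} dB)")
```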
How the audio recording score is composed
SBMARK engineers test recording by evaluating recorded files on reference audio equipment. These recordings are made in our laboratories and under real-life conditions, using default apps and settings.
In recording, the X Fold delivered excellent results, making particularly good use of its wide body when unfolded. The tonal balance was very pleasing, with a transparent rendering of vocal content. Main camera recordings had very natural highs and a well-rendered midrange. Selfie videos were bright, which helped vocal clarity. The signal-to-noise ratio could have been better, especially in urban environments with a lot of background noise, but the clear and precise envelope still kept voices intelligible.
Spatial performance was excellent, with outstanding wideness both horizontally and vertically, resulting in very immersive recordings. Voices were perfectly localizable in the audio scene, with an accurate sense of depth. The recording volume was quite high, and artifacts were well under control at high sound pressure levels. However, when the microphones were covered by hands, our testers noted muffled-sounding recordings and loud finger noise, and the stereo balance could shift to one side. The rendering of the background was good, thanks to a pleasant and natural tonal balance. However, more bass would have enhanced the sense of immersion.
Here’s how the Vivo X Fold fares in recording use cases compared to its competitors:
Recording use-case scores
The Timbre score represents how well a phone captures sounds across the audible tonal range and takes into account bass, mids, treble, and tonal balance. It is the most important attribute for recording.
Life video frequency response
A 1/12-octave frequency response graph, which measures the loudness of each frequency captured by your smartphone while recording a pure sine wave in an anechoic environment.
The Dynamics Score measures the accuracy of changes in the energy level of sound sources, such as how accurately plosives in a voice (p, t, k, for example) are reproduced. The score also considers the signal-to-noise ratio (SNR), such as how loud the lead voice is compared to the background noise.
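The signal-to-noise ratio mentioned here is conventionally expressed in dB as the ratio of the RMS level of the voice to the RMS level of the background. The following sketch shows that standard calculation on HYPOTHETICAL sample values; it assumes the voice and noise segments have already been separated and does not reflect how SBMARK derives its score.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(voice_samples, noise_samples):
    """SNR in dB: 20 * log10(RMS of voice / RMS of background noise)."""
    return 20 * math.log10(rms(voice_samples) / rms(noise_samples))


# HYPOTHETICAL segments: a voice 10x stronger than the background noise.
voice = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]

print(f"SNR = {snr_db(voice, noise):.1f} dB")
```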
Sub-attributes for spatial tests include pinpointing a specific sound’s location, positional balance, distance, and wideness on the recorded audio files.
Recording directivity
Directivity graph of the smartphone when recording test signals using the camera app, with the main camera. It represents the acoustic energy (in dB) as a function of the angle of incidence of the sound source. (Normalized to the 0° angle, in front of the device.)
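Normalizing to the 0° angle simply means subtracting the front-facing level from every other reading, so the front reads 0 dB and the other angles show gain or loss relative to it. A minimal sketch of that step follows; the readings are HYPOTHETICAL and the function name is illustrative.

```python
def normalize_to_front(levels_by_angle, reference_angle=0):
    """Shift all levels (dB) so that the reference angle reads 0 dB."""
    reference_level = levels_by_angle[reference_angle]
    return {
        angle: level - reference_level
        for angle, level in sorted(levels_by_angle.items())
    }


# HYPOTHETICAL recorded levels in dB, keyed by angle of incidence in degrees.
raw_levels = {0: -20.0, 90: -23.0, 180: -26.0, 270: -22.5}

for angle, level in normalize_to_front(raw_levels).items():
    print(f"{angle:>3} deg: {level:+.1f} dB")
```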
The Loudness score represents how loud the audio is after normalization in the recorded files and how well the device handles loud environments, such as electronic music concerts, while recording.
Here are the sound levels recorded in the audio and video files, measured in LUFS (Loudness Units relative to Full Scale); for reference, we expect loudness levels to be above -24 LUFS for recorded content:
Device | Meeting | Life video | Selfie video | Memo |
Vivo X Fold | -25.1 LUFS | -17.9 LUFS | -20 LUFS | -20.2 LUFS |
Xiaomi Mix Fold 2 | -25.3 LUFS | -22.8 LUFS | -19.7 LUFS | -20.8 LUFS |
Samsung Galaxy Z Fold4 | -25.8 LUFS | -21.6 LUFS | -22.7 LUFS | -21 LUFS |
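The -24 LUFS expectation can be checked directly against the table. The sketch below uses the values exactly as listed above; the four use-case labels are this sketch's own shorthand for the table's columns in left-to-right order, and are an assumption about what those columns represent.

```python
TARGET_LUFS = -24.0  # expectation stated in the review

# Shorthand labels (assumed) for the four table columns, left to right.
USE_CASES = ("meeting", "life video", "selfie video", "memo")

# Recorded loudness in LUFS, copied from the table above.
RECORDED_LUFS = {
    "Vivo X Fold": (-25.1, -17.9, -20.0, -20.2),
    "Xiaomi Mix Fold 2": (-25.3, -22.8, -19.7, -20.8),
    "Samsung Galaxy Z Fold4": (-25.8, -21.6, -22.7, -21.0),
}


def below_target(device, target=TARGET_LUFS):
    """Use cases whose recorded loudness is not above the target level."""
    return [
        use_case
        for use_case, lufs in zip(USE_CASES, RECORDED_LUFS[device])
        if lufs <= target
    ]


for device in RECORDED_LUFS:
    print(f"{device}: below target in {below_target(device) or 'none'}")
```

On these figures, all three foldables fall short of the target only in the first column, while the Vivo's second-column recordings are the loudest in the table at -17.9 LUFS.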
The Artifacts score measures the extent to which recorded sounds are affected by various types of distortions. The higher the score, the less noticeable sound disturbances are. Distortions can occur due to in-device sound processing and microphone quality, as well as user handling, such as how the phone is held.
In this audio comparison, you can hear how this smartphone handles wind noise compared to its competitors:
Recordings of a voice sample with slight background noise, facing a 5 m/s turbulent wind
The Background score evaluates how naturally the various sounds around a voice blend into the video recording file. For example, when recording a speech at an event, the background should not interfere with the main voice, but it should provide some context of the surroundings.