> [!info]
> Input: [[Social Media Account|account]]
> Output: [[Face|face picture]], [[Voice|voice pattern]]
>
> Types: [[Behavioural Weakness|behavioural]]
> Weakness: [[SOWEL-3. Creating Content]]
> Functionality: [[SOFL-9. Search]]
### Explanation
A target's public profile typically contains a sizeable corpus of biometric raw material — selfies in the avatar and posts, group photos in the feed, video clips with the target's voice on the audio track, even screen recordings that catch a fingerprint on a touchscreen — none of which was uploaded with biometric search in mind. The investigator's job is to extract the largest, highest-quality sample of each modality before trying to use it elsewhere.
Walk through the account systematically: the avatar at full resolution, every story and highlight, every post containing the target's face or voice, tagged-in photos posted by friends (which often show angles the target would never post themselves), and every video upload (extract the audio track separately for voice work). Pull the highest-resolution version the platform serves rather than thumbnails: face- and voice-recognition systems are sensitive to compression artefacts, and a downscaled JPEG can miss a match that the full-size image would have produced.
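The audio-extraction step can be sketched as below. This is a minimal sketch, not a prescribed workflow: it assumes `ffmpeg` is on the `PATH`, and the function names, the 16 kHz mono WAV target and the output layout are illustrative choices rather than anything this note specifies.

```python
import subprocess
from pathlib import Path


def build_audio_extract_cmd(video: Path, out_dir: Path) -> list[str]:
    """Build an ffmpeg command that writes the audio track of `video` as WAV."""
    wav = out_dir / (video.stem + ".wav")
    return [
        "ffmpeg",
        "-n",                    # never overwrite an existing output file
        "-i", str(video),        # input video as downloaded
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # uncompressed 16-bit PCM, no extra lossy step
        "-ar", "16000",          # 16 kHz sample rate (assumed; match your tool)
        "-ac", "1",              # mono
        str(wav),
    ]


def extract_audio(video: Path, out_dir: Path) -> Path:
    """Run ffmpeg and return the path of the extracted WAV file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    cmd = build_audio_extract_cmd(video, out_dir)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(cmd[-1])
```

Writing uncompressed PCM avoids stacking a second lossy encode on top of the platform's own compression; the sample rate and channel count should follow whatever the downstream voice tool expects.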
The output is a dataset of face crops (ideally covering diverse angles and lighting conditions) and voice clips (at least a few seconds each). It becomes the input for [[SOTL-3.11. Search Accounts by Biometric Data]] and the corpus that any subsequent human-review verification step works from.
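One way to keep such a dataset reviewable is a small manifest that records each sample once, with a content hash. The sketch below is an assumption-laden illustration: the directory layout, the `build_manifest` name, the field names and the rule of inferring modality from the file extension are all hypothetical, and it presumes the folder holds only face crops and voice clips.

```python
import hashlib
from pathlib import Path

# Assumed mapping: the folder contains only face crops and voice clips.
MODALITY_BY_SUFFIX = {
    ".jpg": "face", ".jpeg": "face", ".png": "face",
    ".wav": "voice", ".mp3": "voice", ".m4a": "voice",
}


def build_manifest(sample_dir: Path) -> list[dict]:
    """List every sample once, skipping byte-identical duplicates."""
    seen: set[str] = set()
    manifest: list[dict] = []
    for path in sorted(sample_dir.rglob("*")):
        modality = MODALITY_BY_SUFFIX.get(path.suffix.lower())
        if not path.is_file() or modality is None:
            continue  # ignore directories and unrelated files
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            continue  # same bytes already recorded under another name
        seen.add(digest)
        manifest.append({
            "file": str(path.relative_to(sample_dir)),
            "modality": modality,
            "sha256": digest,
            "bytes": path.stat().st_size,
        })
    return manifest
```

The hash only catches exact duplicates; re-encoded copies of the same photo will still appear as separate entries and need human review.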
### Examples
{{some links to articles, videos, etc}}
### Tools
- [yt-dlp](https://github.com/yt-dlp/yt-dlp): download videos from Instagram, YouTube, TikTok and Facebook at the best quality the platform serves
- [Instaloader](https://instaloader.github.io/): bulk-download Instagram profile media, stories and highlights (stories and highlights require a logged-in session)
- [ffmpeg](https://ffmpeg.org/): extract the audio track from any downloaded video for voice work
- [ExifTool](https://github.com/exiftool/exiftool): recover capture time and device metadata that help date each sample, where the platform has not stripped it on upload
- [face_recognition](https://github.com/ageitgey/face_recognition): Python library for detecting face regions so they can be cropped out of group photos before downstream use
### See also
- [[SOTL-3.11. Search Accounts by Biometric Data]]