Audio Preprocessing

Vocal source separation with Demucs (htdemucs)

                
Last updated: 2026-06-03 00:43
            

Separated stems: 4
Videos processed: 28,121
Input: Raw audio
Process: Demucs htdemucs (GPU vocal separation)
Output: Isolated vocal tracks

Each audio file is processed through the htdemucs model for source separation. The signal is decomposed into four stems (vocals, bass, drums, other). Only the vocal stem is retained for subsequent steps.

Vocal separation runs with GPU acceleration on the network of collaborators' machines, and processing is parallelized across the available compute nodes.


The audio track extracted from each video is passed through the Demucs source separation model (htdemucs variant). Demucs decomposes the signal into four stems (vocals, bass, drums, other), and only the vocal stem is retained. This step removes background music, jingles, and other interfering noise, which significantly improves the quality of the subsequent speech recognition.
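The separation step described above can be sketched with the Demucs command-line interface. This is a minimal illustration, not the pipeline's actual code: the helper names (`build_demucs_command`, `vocal_stem_path`, `discard_other_stems`), the example file paths, and the choice to delete the unused stems from disk are all assumptions. The `-n`, `-d`, and `-o` options and the default `<out>/<model>/<track>/<stem>.wav` output layout are those of the Demucs CLI.

```python
from pathlib import Path


def build_demucs_command(audio_path, out_dir, device="cuda"):
    """Return the argument list for one four-stem htdemucs separation run."""
    return [
        "python", "-m", "demucs",
        "-n", "htdemucs",      # pretrained model named in the text
        "-d", device,          # "cuda" for GPU, "cpu" as a fallback
        "-o", str(out_dir),    # root folder for the separated stems
        str(audio_path),
    ]


def vocal_stem_path(audio_path, out_dir, model="htdemucs"):
    """Path where the Demucs CLI writes the vocal stem by default."""
    return Path(out_dir) / model / Path(audio_path).stem / "vocals.wav"


def discard_other_stems(audio_path, out_dir, model="htdemucs"):
    """Delete bass.wav, drums.wav and other.wav, keeping only vocals.wav.

    Hypothetical cleanup step: the source only says the vocal stem is retained.
    """
    track_dir = vocal_stem_path(audio_path, out_dir, model).parent
    removed = []
    for name in ("bass", "drums", "other"):
        stem_file = track_dir / f"{name}.wav"
        if stem_file.exists():
            stem_file.unlink()
            removed.append(name)
    return removed
```

With Demucs installed, a run would look like `subprocess.run(build_demucs_command("audio/abc123.wav", "separated"), check=True)`, after which `vocal_stem_path("audio/abc123.wav", "separated")` points at the file handed to the next step.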

Processing runs continuously, with GPU acceleration, on the network of collaborators' machines. Each video is processed automatically as soon as the scanner detects it.
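The source does not describe how work is scheduled across machines. As one possible sketch, detected videos can be placed on a shared queue from which each available node claims the next item. Here threads stand in for compute nodes, and `run_workers`, the node names, and the `process` callback are hypothetical names introduced for illustration only.

```python
import queue
import threading


def run_workers(video_ids, node_names, process):
    """Let each node pull video IDs from a shared queue until it is empty."""
    pending = queue.Queue()
    for video_id in video_ids:
        pending.put(video_id)

    results = {}              # video_id -> (node_name, process output)
    lock = threading.Lock()

    def worker(node_name):
        while True:
            try:
                video_id = pending.get_nowait()
            except queue.Empty:
                return        # nothing left to claim
            output = process(video_id)
            with lock:
                results[video_id] = (node_name, output)

    threads = [threading.Thread(target=worker, args=(name,)) for name in node_names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because every node takes work only when it is free, faster machines naturally process more videos, and each video is claimed exactly once.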

Tools used: Demucs (htdemucs), PyTorch

#	Table	Description	Scale
1	videos	One row per video: ID, channel metadata, views, likes, comments, tags, duration, upload date, political orientation, country, gender.	26,396 rows
2	comments	All comments with author info, like counts, timestamps, nested reply structure, and JSONB analysis column.	9.6M+ rows
3	video_transcripts	Full diarized transcripts with speaker labels and cleaned text versions.	28,121 rows
4	transcription_speakers	Individual speaker segments from diarization, ordered by position within each video.	1,021,611 rows
5	comments_processed	Sentence-level tokenized comments with NER entities (PER, ORG, LOC) and ML prediction columns.	15.3M+ rows
6	transcription_speakers_processed	Sentence-level speaker segments with NER extraction and full annotation suite.	4.8M+ rows


Continuous Observatory

The database is continuously updated: channel scanning, video transcription and annotation, comment extraction, metadata updates (views, likes, subscribers). Each scan produces a longitudinal history accessible via the API.

