Continuous recording and processing

Post by kprtnj »

Hi,
since my neighbor has a new girlfriend, who is quite active (viewtopic.php?p=106595#p106595), I've wanted to find a way to continuously record in the hallway.

So I've bought a cheap mini USB recorder on a famous Chinese website. The recording quality is not really good (probably because there's no real mic) but it has two useful properties:
  • files are timestamped
  • records can be split
So my workflow is:
  • record for ~a day, split by hour
  • copy all files to my computer
  • use Audacity's "pipe" mode to run macros to "truncate silence" above noise level, resulting files have no silence
  • use https://github.com/tyiannak/pyAudioAnalysis's segmentClassifyFile to mark interesting timestamps
  • validate results manually
  • edit files manually (high/low pass, cuts, denoise)
In the end, filtering a day of audio takes me about 5 minutes of manual work.

Machine learning training

pyAudioAnalysis's training was done as follows:
  • using https://sonicvisualiser.org/ to label segments
  • then splitting using https://github.com/tyiannak/pyAudioAnal ... Extraction featureExtractionFile to generate labeled files
  • the loop below to train and test :

    for method in knn svm extratrees gradientboosting randomforest ; do 
        python3 ../pyAudioAnalysis/pyAudioAnalysis/audioAnalysis.py  trainClassifier -i labeled/* --method "$method" -o ../models/$method-classifier.out | tee "../models/${method}-train.txt"
    done
    
svm was definitely the best.

Machine learning file segmentation

for i in macro-output/*WAV ; do 
    python3 ../pyAudioAnalysis/pyAudioAnalysis/audioAnalysis.py  segmentClassifyFile -i "$i"  --model svm --modelName ../models/svm-classifier3.out | grep ': sex' | grep -v -F '0:00:00 - 0:00:01: sex' > "${i}-analysis.txt"
done
Sample results:
1. false positives (scattered short timestamps)

0:01:11 - 0:01:12: sex
0:02:49 - 0:02:50: sex
0:05:06 - 0:05:07: sex
0:05:26 - 0:05:27: sex
0:07:03 - 0:07:04: sex
0:08:20 - 0:08:21: sex
2. true positives (dense timestamps, sometimes several seconds long)

0:00:06 - 0:00:08: sex
0:00:10 - 0:00:11: sex
0:00:13 - 0:00:14: sex
0:00:16 - 0:00:18: sex
0:00:21 - 0:00:22: sex
0:01:10 - 0:01:11: sex
0:01:12 - 0:01:13: sex
0:01:15 - 0:01:16: sex
0:01:21 - 0:01:22: sex
0:01:25 - 0:01:26: sex
0:01:30 - 0:01:31: sex
0:01:42 - 0:01:44: sex
0:01:53 - 0:01:54: sex
0:01:57 - 0:01:58: sex
0:02:01 - 0:02:02: sex
0:02:06 - 0:02:07: sex
0:02:10 - 0:02:11: sex
0:02:13 - 0:02:15: sex
0:04:32 - 0:04:33: sex
0:04:46 - 0:04:47: sex
0:04:48 - 0:04:49: sex
0:06:44 - 0:06:45: sex
Notes

I think the training only works for this particular combination of recorder and voice. It would probably be quite easy to train something more advanced (a ResNet or something like that) given enough samples.