Continuous recording and processing

Post by **kprtnj** » 07 Apr 2025 21:04

Hi,
Since my neighbor has a new girlfriend who is quite active (viewtopic.php?p=106595#p106595), I've been looking for a way to record continuously in the hallway.

So I bought a cheap mini USB recorder from a well-known Chinese website. The recording quality is not great (probably because it has no real microphone), but it has two useful properties:

files are timestamped
recordings can be split into separate files

So my workflow is:

record for ~a day, split by hour
copy all files to my computer
use Audacity's "pipe" mode (mod-script-pipe) to run macros that apply "Truncate Silence" with the threshold set just above the noise level, so the resulting files contain no silence
use segmentClassifyFile from https://github.com/tyiannak/pyAudioAnalysis to mark interesting timestamps
validate results manually
edit files manually (high/low pass, cuts, denoise)

In the end, filtering a day of audio takes me about 5 minutes of manual work.
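
A minimal sketch of the silence-truncation step driven from Python through Audacity's mod-script-pipe. It sends individual scripting commands rather than a saved macro, which is a substitution for what the post describes. The function names, the threshold and the durations are placeholders, not the values actually used, and the command parameters should be checked against the scripting reference of the installed Audacity version.

```python
import os
import sys


def pipe_paths():
    """Return (to_audacity, from_audacity) pipe paths used by mod-script-pipe."""
    if sys.platform == "win32":
        return r"\\.\pipe\ToSrvPipe", r"\\.\pipe\FromSrvPipe"
    uid = str(os.getuid())
    return ("/tmp/audacity_script_pipe.to." + uid,
            "/tmp/audacity_script_pipe.from." + uid)


def build_commands(src, dst, threshold_db=-35.0, min_silence_s=0.5, keep_s=0.0):
    """Scripting commands to import src, truncate silence and export to dst.

    threshold_db, min_silence_s and keep_s are placeholder values (assumptions);
    tune the threshold so it sits just above the recorder's noise level.
    """
    return [
        f'Import2: Filename="{src}"',
        "SelectAll:",
        f"TruncateSilence: Threshold={threshold_db} "
        f'Action="Truncate Detected Silence" '
        f"Minimum={min_silence_s} Truncate={keep_s}",
        f'Export2: Filename="{dst}" NumChannels=1',
        # Clear the project so the next file starts from an empty state.
        "SelectAll:",
        "RemoveTracks:",
    ]


def run_commands(commands):
    """Send each command to a running Audacity and collect its replies."""
    to_path, from_path = pipe_paths()
    eol = "\r\n\0" if sys.platform == "win32" else "\n"
    replies = []
    with open(to_path, "w") as to_pipe, open(from_path, "r") as from_pipe:
        for command in commands:
            to_pipe.write(command + eol)
            to_pipe.flush()
            lines = []
            while True:
                line = from_pipe.readline()
                if line == "":  # pipe closed
                    break
                if line == "\n" and lines:  # blank line ends a reply
                    break
                lines.append(line.rstrip("\n"))
            replies.append("\n".join(lines))
    return replies
```

Audacity has to be running with mod-script-pipe enabled before `run_commands(build_commands(src, dst))` is called. Paths should be absolute, since Audacity resolves them from its own working directory.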

Machine learning training

pyAudioAnalysis's training was done as follows:

using https://sonicvisualiser.org/ to label segments
then splitting using https://github.com/tyiannak/pyAudioAnal ... Extraction featureExtractionFile to generate labeled files
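
As a stand-in for the splitting step, here is a rough sketch that cuts a WAV file into one clip per labelled region using only Python's standard library. It is not the tool linked above. It assumes a header-less CSV with rows of `start_seconds,end_seconds,label`, which is a hypothetical layout and may not match the annotation export actually used. Clips go into one directory per label.

```python
import csv
import os
import wave


def split_by_labels(wav_path, csv_path, out_dir):
    """Cut wav_path into one clip per labelled region, grouped by label.

    Assumes csv_path has rows of: start_seconds,end_seconds,label
    (a hypothetical layout). Returns the list of files written.
    """
    written = []
    base = os.path.splitext(os.path.basename(wav_path))[0]
    with wave.open(wav_path, "rb") as src, open(csv_path, newline="") as f:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        for index, row in enumerate(csv.reader(f)):
            if len(row) < 3:
                continue  # skip blank or malformed rows
            try:
                start, end = float(row[0]), float(row[1])
            except ValueError:
                continue  # skip rows whose times do not parse
            label = row[2].strip()
            first = min(max(0, int(start * rate)), total)
            count = max(0, int(end * rate) - first)
            src.setpos(first)
            frames = src.readframes(count)
            class_dir = os.path.join(out_dir, label)
            os.makedirs(class_dir, exist_ok=True)
            out_path = os.path.join(class_dir, f"{base}-{index:04d}.wav")
            with wave.open(out_path, "wb") as dst:
                # The frame count in the header is corrected when dst closes.
                dst.setparams(params)
                dst.writeframes(frames)
            written.append(out_path)
    return written
```

For example, `split_by_labels("rec.wav", "rec-labels.csv", "labeled")` (hypothetical file names) would fill `labeled/<label>/` with clips.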

then the loop below to train and test:

Code:

# Train one classifier per method and keep each training log next to its model.
for method in knn svm extratrees gradientboosting randomforest; do
    python3 ../pyAudioAnalysis/pyAudioAnalysis/audioAnalysis.py trainClassifier -i labeled/* --method "$method" -o "../models/${method}-classifier.out" | tee "../models/${method}-train.txt"
done

svm was definitely the best.

Machine learning file segmentation

Code:

# Classify each silence-truncated file; keep only the "sex" segments,
# dropping the hit reported for the first second of every file.
for i in macro-output/*WAV; do
    python3 ../pyAudioAnalysis/pyAudioAnalysis/audioAnalysis.py segmentClassifyFile -i "$i" --model svm --modelName ../models/svm-classifier3.out | grep ': sex' | grep -v -F '0:00:00 - 0:00:01: sex' > "${i}-analysis.txt"
done

Sample results:
1. false positives (scattered short timestamps)

Code:

0:01:11 - 0:01:12: sex
0:02:49 - 0:02:50: sex
0:05:06 - 0:05:07: sex
0:05:26 - 0:05:27: sex
0:07:03 - 0:07:04: sex
0:08:20 - 0:08:21: sex

2. true positives (dense timestamps, sometimes several seconds long)

Code:

0:00:06 - 0:00:08: sex
0:00:10 - 0:00:11: sex
0:00:13 - 0:00:14: sex
0:00:16 - 0:00:18: sex
0:00:21 - 0:00:22: sex
0:01:10 - 0:01:11: sex
0:01:12 - 0:01:13: sex
0:01:15 - 0:01:16: sex
0:01:21 - 0:01:22: sex
0:01:25 - 0:01:26: sex
0:01:30 - 0:01:31: sex
0:01:42 - 0:01:44: sex
0:01:53 - 0:01:54: sex
0:01:57 - 0:01:58: sex
0:02:01 - 0:02:02: sex
0:02:06 - 0:02:07: sex
0:02:10 - 0:02:11: sex
0:02:13 - 0:02:15: sex
0:04:32 - 0:04:33: sex
0:04:46 - 0:04:47: sex
0:04:48 - 0:04:49: sex
0:06:44 - 0:06:45: sex
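
The difference between the two samples (scattered hits versus dense runs) can be checked automatically before the manual validation. The sketch below parses lines in the format shown above and keeps only groups of hits that sit close together. `max_gap` and `min_hits` are made-up thresholds, not values from the post, and would need tuning on real output.

```python
import re

# Matches lines such as "0:01:11 - 0:01:12: sex"
LINE_RE = re.compile(r"(\d+):(\d+):(\d+) - (\d+):(\d+):(\d+): (\S+)")


def parse_segments(text, label="sex"):
    """Return (start, end) pairs in seconds for lines carrying the given label."""
    segments = []
    for line in text.splitlines():
        m = LINE_RE.fullmatch(line.strip())
        if not m or m.group(7) != label:
            continue
        h1, m1, s1, h2, m2, s2 = (int(g) for g in m.groups()[:6])
        segments.append((h1 * 3600 + m1 * 60 + s1, h2 * 3600 + m2 * 60 + s2))
    return segments


def dense_clusters(segments, max_gap=15, min_hits=3):
    """Group hits separated by at most max_gap seconds.

    Returns (start, end, hit_count) for each group with at least min_hits
    hits; smaller groups are treated as likely false positives.
    """
    clusters, current = [], []
    for seg in sorted(segments):
        if current and seg[0] - current[-1][1] > max_gap:
            clusters.append(current)
            current = []
        current.append(seg)
    if current:
        clusters.append(current)
    return [(c[0][0], c[-1][1], len(c)) for c in clusters if len(c) >= min_hits]
```

With these placeholder thresholds, the first sample above produces no clusters and the second produces three (0:00:06 to 0:00:22, 0:01:10 to 0:02:15 and 0:04:32 to 0:04:49). The timestamps refer to the silence-truncated files, not to the original recording.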

Notes

I think the trained model only works for this particular combination of recorder and neighbor's voice. It would probably be quite easy to train something more advanced (a ResNet or something similar) given enough samples.