Pattern Matching Best Practices

BEST PRACTICES:

● For testing templates and pattern matching parameters (e.g., threshold), use playlists of fewer than 500 recordings.

● To create a useful playlist for testing the performance of templates and pattern matching parameters, we recommend manually validating a subset of recordings.

○ Our science team usually manually validates (i.e., marks the presence/absence of species in the audio recordings) all recordings from 1-2 days at each sampling site.

● Do not run a pattern matching job with a large playlist (>500 recordings) until you are confident in the template and pattern matching parameters. We expect that this new feature will greatly assist your project. However, excessively redundant analyses with this feature can slow down your analyses.

PATTERN MATCHING GUIDE: 

Check out our Pattern Matching articles for tutorials on how to use this revolutionary feature. Additionally, we strongly suggest checking out our WildLabs Tech Tutors talk for an in-depth tutorial on Pattern Matching. 

Additional tips from our science team

● The first step in acquiring species information from the audio recordings consists of manually validating a subset of recordings.
Our science team usually manually inspects and validates species in all recordings from one to two non-consecutive days from each sampling site. 
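
As a rough illustration of this sampling scheme, the sketch below picks up to two non-consecutive days per site from a list of recordings and keeps only the recordings from those days. The `Recording` tuple, its fields, and both function names are hypothetical and exist only for this example; they are not part of the Arbimon platform.

```python
from collections import defaultdict
from datetime import datetime
from typing import NamedTuple


class Recording(NamedTuple):
    """Hypothetical record; the platform's own data model may differ."""
    site: str
    recorded_at: datetime


def pick_validation_days(recordings, days_per_site=2):
    """Return {site: [dates]} with up to `days_per_site` non-consecutive days."""
    days_by_site = defaultdict(set)
    for rec in recordings:
        days_by_site[rec.site].add(rec.recorded_at.date())

    chosen = {}
    for site, days in days_by_site.items():
        picked = []
        for day in sorted(days):
            # Skip a day that directly follows the last picked day.
            if picked and (day - picked[-1]).days < 2:
                continue
            picked.append(day)
            if len(picked) == days_per_site:
                break
        chosen[site] = picked
    return chosen


def validation_subset(recordings, days_per_site=2):
    """Keep only the recordings that fall on the chosen days of their site."""
    chosen = pick_validation_days(recordings, days_per_site)
    return [r for r in recordings if r.recorded_at.date() in chosen[r.site]]
```

The sketch simply takes the earliest qualifying days; in practice you may prefer days spread across the sampling period.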

● During the manual validation of a subset of recordings, our science team also adds Tags named "Good template_species name" to identify potentially good templates for each focal species.

o Explore how to create tags here

o We consider a template good when the sound of interest has a high signal-to-noise ratio and does not overlap with other sounds.

o After manually validating a subset of recordings, you will probably have several recordings tagged "Good template_species name" for some species; you can then compare them and select one to use in the pattern matching approach.

● The preliminary species list can offer you a first glimpse of species that occur in your study area and allow you to test the performance of the pattern matching models. Once you finish the manual inspection and validation of a subset of recordings, create a playlist with all detected species.

o To learn how to create a playlist, visit our tutorials here

o Then, create a pattern matching job that runs over the playlist built from the preliminary species list (< 500 recordings) to test the performance of the pattern matching models.

o Select the templates and pattern matching parameters that result in more detections of your focal species.

o Once you are confident in the template and pattern matching parameters, run a new pattern matching job over recordings from all sampling sites.
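
The comparison of templates and parameters on the test playlist can be sketched as below. This is only an illustration under stated assumptions: it supposes you have, for each template/parameter combination, the ids of the recordings where a match was reported, plus the ids that manual validation marked as "present". The function names, the configuration labels, and the `MAX_TEST_PLAYLIST` constant are hypothetical.

```python
MAX_TEST_PLAYLIST = 500  # guideline from this article: test on fewer than 500 recordings


def score_configuration(detected_ids, validated_present_ids):
    """Count true positives, false positives, and misses for one configuration.

    detected_ids: recording ids in which the job reported a match.
    validated_present_ids: recording ids manually validated as species present.
    """
    detected = set(detected_ids)
    present = set(validated_present_ids)
    return {
        "true_positives": len(detected & present),
        "false_positives": len(detected - present),
        "missed": len(present - detected),
    }


def best_configuration(results, validated_present_ids, playlist_size):
    """Pick the configuration that finds the focal species in the most recordings."""
    if playlist_size >= MAX_TEST_PLAYLIST:
        raise ValueError("test playlist should contain fewer than 500 recordings")
    scores = {
        name: score_configuration(ids, validated_present_ids)
        for name, ids in results.items()
    }
    # Most true positives first; fewer false positives breaks ties.
    best = max(
        scores,
        key=lambda n: (scores[n]["true_positives"], -scores[n]["false_positives"]),
    )
    return best, scores
```

Ranking by true positives first mirrors the advice above to favor configurations with more detections of the focal species; false positives only break ties because they can be excluded later during validation.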

● Our science team usually creates separate diurnal and nocturnal playlists covering all sampling sites for the final analyses. This procedure increases analysis speed and avoids excessive false positives.
For example, a nocturnal playlist for an owl species will reduce the number of false positives and optimize the analysis. You can test whether the species is also active during the day by running the job on a smaller set of recordings from the daytime playlist.
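
A minimal sketch of such a split is shown below. It assumes each recording is available as a `(recording_id, local_datetime)` pair and uses fixed 06:00 and 18:00 boundaries; both the input shape and the boundaries are assumptions for illustration and should be adapted to local sunrise and sunset at your sites.

```python
from datetime import datetime, time

# Assumed boundaries; adjust to local sunrise and sunset at your sites.
DAY_START = time(6, 0)
DAY_END = time(18, 0)


def split_diurnal_nocturnal(recordings):
    """Split (recording_id, local_datetime) pairs into day and night playlists."""
    diurnal, nocturnal = [], []
    for rec_id, recorded_at in recordings:
        if DAY_START <= recorded_at.time() < DAY_END:
            diurnal.append(rec_id)
        else:
            nocturnal.append(rec_id)
    return diurnal, nocturnal
```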

● Species-specific identification models, such as pattern matching models, can be used to detect species in the audio recordings. RFCx Arbimon Pattern Matching is a template-based detection tool.

o The tool takes a template selected from an acoustic signal (i.e., the vocalization of a species) set by the user and searches within a user-defined playlist of recordings for signals that match the template. 

o This procedure detects time-localized signals with a correlation equal to or greater than a threshold assigned by the user.

o The correlation is computed in the spectrogram (time-frequency) domain (LeBien et al. 2020).

o Results from pattern matching models are presented in a grid view for subsequent validation.
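
To make the idea of thresholded correlation in the spectrogram domain more concrete, here is a simplified sketch. It is not Arbimon's implementation nor the method of LeBien et al. (2020): it substitutes a plain Pearson correlation between the template and each same-sized window of the spectrogram, sliding along the time axis only, and it assumes the template and spectrogram cover the same frequency band. All names are hypothetical.

```python
import math


def _flatten(patch):
    """Turn a list of rows into one flat list of values."""
    return [value for row in patch for value in row]


def _pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def match_template(spectrogram, template, threshold):
    """Slide `template` along the time axis and return (frame, score) matches.

    Both arguments are lists of rows (frequency bins) by columns (time frames).
    A window is reported when its correlation is equal to or greater than
    `threshold`.
    """
    width = len(template[0])
    n_frames = len(spectrogram[0])
    flat_template = _flatten(template)
    matches = []
    for start in range(n_frames - width + 1):
        window = [row[start:start + width] for row in spectrogram]
        score = _pearson(_flatten(window), flat_template)
        if score >= threshold:
            matches.append((start, score))
    return matches
```

With a toy spectrogram that contains one exact copy of the template, a high threshold returns only that copy, while a lower threshold also returns partially similar windows.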

● The proper parameters for a pattern matching job depend on your ecological questions.

o Our science team usually uses the pattern matching approach to understand where the species occur in the landscape and what influences where a species is found.
Therefore, we typically select a low threshold (0.1-0.3) and one match per recording.

o Using a low threshold, the pattern matching algorithm will detect anything slightly similar to your template. 
Therefore, you will be able to detect natural variations of the species call, but as a tradeoff, you will also detect many false positives.

o Since it is so easy to manually validate the pattern matching outputs and exclude the false positives using the grid view in our platform, we still prefer using a low threshold in our ecological analyses. 
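
The effect of the threshold and the matches-per-recording setting can be illustrated with the sketch below. It assumes candidate matches are available as `(recording_id, time_offset_s, score)` tuples; this input shape, the function name, and the default values are hypothetical and are not the platform's API.

```python
from collections import defaultdict


def select_matches(candidates, threshold=0.2, matches_per_recording=1):
    """Keep the top-scoring candidates per recording that reach `threshold`.

    candidates: iterable of (recording_id, time_offset_s, score) tuples.
    """
    by_recording = defaultdict(list)
    for rec_id, offset, score in candidates:
        if score >= threshold:
            by_recording[rec_id].append((rec_id, offset, score))

    selected = []
    for rec_id in sorted(by_recording):
        ranked = sorted(by_recording[rec_id], key=lambda m: m[2], reverse=True)
        selected.extend(ranked[:matches_per_recording])
    return selected
```

Lowering `threshold` lets more recordings through, including weaker or atypical calls and more false positives, while `matches_per_recording=1` keeps the number of items to validate at no more than one per recording.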

● Our science team usually uses the pattern matching results in occupancy models to understand where the species occur in the landscape and what influences where a species is found.

o Therefore, we usually use the filter "Best per site/day" to validate the pattern matching results manually. 

o This procedure significantly reduces the number of recordings that need to be validated.
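
The reduction can be pictured with the sketch below, which assumes that "Best per site/day" keeps the single highest-scoring match for each site and day. That reading of the filter, the dictionary keys, and the function name are assumptions for illustration; the platform's exact behavior may differ.

```python
def best_per_site_day(detections):
    """Keep the highest-scoring detection for each (site, day) pair.

    detections: iterable of dicts with "site", "date" (ISO string), and "score" keys.
    """
    best = {}
    for det in detections:
        key = (det["site"], det["date"])
        if key not in best or det["score"] > best[key]["score"]:
            best[key] = det
    # Return in a stable site/date order for easier review.
    return [best[key] for key in sorted(best)]
```

Under this assumption, the number of items to validate is bounded by the number of site/day combinations rather than by the number of recordings.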