We excel at processing data from a single site:
Data formats are consistent
File structure is consistent
Excel is the most common data analysis tool:
Analyzing one site: easy
Analyzing many sites: tedious copy+paste
"Everyone has their own NVSPL reader function"
New scripts often mean new file-reading code
Mostly ad-hoc solutions
Accessing many files still takes significant time and effort
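For illustration, a typical ad-hoc reader looks something like this, rewritten slightly differently in every script. The column names, file naming, and date handling below are simplified stand-ins, not the full NVSPL spec:

```python
from pathlib import Path
import pandas as pd

def read_nvspl(path):
    """One-off NVSPL reader; a variant of this lives in nearly every script.

    Columns here are illustrative; real NVSPL files carry many more
    (1/3-octave bands, weather flags, etc.).
    """
    df = pd.read_csv(path)
    df["STime"] = pd.to_datetime(df["STime"])
    return df.set_index("STime")

def read_site(folder):
    """Glue every hourly NVSPL file in one site's folder into one DataFrame."""
    files = sorted(Path(folder).glob("NVSPL_*.txt"))
    return pd.concat(read_nvspl(f) for f in files)
```

Multiply this by every analyst and every script, and the duplicated effort adds up.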
Great at processing data from one site routinely
Good at analyzing data from one site routinely
Still working on analyzing data across many, many sites routinely
Python library for accessing any subset of the whole natural sounds dataset
Once you write this:
import soundDB
sites = ["DENAUPST2015", "DENAFANG2013", "DENAWEBU2009"]
srcids = soundDB.srcid.all(sites)
# now the contents of the SRCID files for those three sites
# are loaded into one DataFrame, called srcids
# analyze as you wish
nvspls = soundDB.nvspl.all(sites)
metrics = soundDB.metrics.all(sites)
listening = soundDB.audibility.all(sites)
dailyPAs = soundDB.dailypa.all(sites)
loudevents = soundDB.loudevents.all(sites)
You can easily do this:
import soundDB
sites = ["DENAUPST2015", "DENAFANG2013", "DENAWEBU2009"]
srcids = soundDB.srcid.all(sites)
srcids.to_excel("allSrcIDs.xls")
# saves all 3 srcID files
# concatenated into one Excel workbook
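The payoff of one combined DataFrame is that cross-site summaries become one-liners. A minimal sketch with a toy stand-in for srcids (the siteID column and the source codes here are assumptions for illustration, not soundDB's actual output):

```python
import pandas as pd

# Toy stand-in for the combined srcids DataFrame; assume (hypothetically)
# that each row is one noise event, labeled with its site ID.
srcids = pd.DataFrame({
    "siteID": ["DENAUPST2015", "DENAUPST2015", "DENAFANG2013"],
    "srcID":  ["1.1", "1.2", "1.1"],   # source codes, illustrative only
    "len":    [45.0, 30.0, 60.0],      # event length, seconds
})

# Cross-site summaries in one line each:
events_per_site = srcids.groupby("siteID").size()
noise_seconds_per_site = srcids.groupby("siteID")["len"].sum()
```

No copy+paste between workbooks; adding a fourth site is just one more entry in the sites list.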
Use metadata to find sites that match some criteria
Load all their data with soundDB
Eventually: integrate with Metadata Database
Analyze
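That workflow might sketch out like this, assuming the metadata lands in a simple table. The column names and the filter criteria are hypothetical, and the commented-out soundDB call is the proposal above, not a finished API:

```python
import pandas as pd

# Hypothetical site-metadata table; eventually this would come
# from the Metadata Database.
metadata = pd.DataFrame({
    "siteID": ["DENAUPST2015", "DENAFANG2013", "DENAWEBU2009", "GAARALAT2012"],
    "park":   ["DENA", "DENA", "DENA", "GAAR"],
    "year":   [2015, 2013, 2009, 2012],
})

# 1. Use metadata to find sites that match some criteria
sites = metadata.loc[metadata["park"] == "DENA", "siteID"].tolist()

# 2. Load all their data with soundDB (as proposed above):
# import soundDB
# nvspls = soundDB.nvspl.all(sites)

# 3. Analyze
```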
Would this be used?
By whom? On what data?
Are they really willing to learn Python, decently well?
Is it better as a standalone tool, or as a library in a programming language?