MeloTranscript web service API launched

As we already mentioned in the previous blog post, we recently launched our MeloTranscript web service API at Music Hack Day Barcelona. This blog post provides some more information on that subject.

First of all, let's restate what the service can do: it transcribes monophonic melodies (recorded in an audio file) into a sequence of notes. Basically, you give it a sound file with some singing, and the web service gives you back a start time, end time, average pitch, and amplitude for each detected note in the sound file.
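To make the per-note output concrete, here is a minimal sketch of how a client might hold the four values in memory. The class name, field names, and units (seconds and Hz) are assumptions for illustration, not the service's actual schema, and the two notes are made-up sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    """One detected note. Field names and units are assumed, not the API's."""
    start: float      # start time, assumed to be in seconds
    end: float        # end time, assumed to be in seconds
    pitch_hz: float   # average pitch, assumed to be in Hz
    amplitude: float  # average amplitude (scale not specified by the post)

    @property
    def duration(self) -> float:
        return self.end - self.start


# Made-up sample data, only to show the shape of a transcription.
melody = [
    Note(start=0.10, end=0.55, pitch_hz=220.0, amplitude=0.8),
    Note(start=0.60, end=1.20, pitch_hz=246.9, amplitude=0.7),
]

for note in melody:
    print(f"{note.start:.2f}-{note.end:.2f} s  {note.pitch_hz:.1f} Hz  dur={note.duration:.2f} s")
```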

[Image: singing into a microphone]
(photo by Julián Rodriguez Orihuela, source: Flickr, license: CC BY 2.0)

The usage workflow entails a few simple steps:

First, you need to upload the sound file you want to have processed. This must be a recording of a monophonic melody: a single person singing or humming. Stereo recordings will be converted to mono.
A full mix of several instruments won't work, because the service processes monophonic melodies only. So, it doesn't make sense to try this with full songs from your music collection (unless it's a solo a cappella, of course).
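The post does not say how the server folds a stereo recording down to mono. A common approach is to average the left and right channels, and the sketch below shows that approach on interleaved samples. The function name is hypothetical and this is not necessarily what the service does.

```python
def stereo_to_mono(interleaved):
    """Average left/right pairs from an interleaved [L, R, L, R, ...] sequence.

    Assumption: simple channel averaging; the service's real downmix
    method is not documented in this post.
    """
    if len(interleaved) % 2 != 0:
        raise ValueError("expected an even number of samples (L/R pairs)")
    left = interleaved[0::2]
    right = interleaved[1::2]
    return [(l + r) / 2 for l, r in zip(left, right)]


# Two stereo frames: (1.0, 3.0) and (-2.0, 2.0)
mono = stereo_to_mono([1.0, 3.0, -2.0, 2.0])
print(mono)
```

If you want control over the downmix (for example, keeping only the channel that has the voice), converting to mono yourself before uploading may be the safer option.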

Once the sound file has been received by our server, it will be added to a processing queue, ready to be analyzed by a pitch detection and note segmentation algorithm (originally developed at Ghent University). This algorithm will do two things: it will find the starts and ends of notes in the melody, and assign a frequency/pitch and amplitude to each note.
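To give a feel for what "finding starts and ends of notes" involves, here is a deliberately naive sketch. It is not the Ghent University algorithm, whose details are not described here. It assumes a frame-wise pitch track is already available (`None` for unvoiced frames) and starts a new note at every silence or pitch jump larger than half a semitone. All names and thresholds are illustrative.

```python
import math


def segment_notes(f0_track, amp_track, hop_s=0.01, max_jump_semitones=0.5):
    """Group consecutive voiced frames into notes (naive stand-in method)."""
    notes = []
    current = []  # indices of the frames in the note being built

    def flush():
        # Close the note in progress, averaging pitch and amplitude.
        if current:
            pitches = [f0_track[i] for i in current]
            amps = [amp_track[i] for i in current]
            notes.append({
                "start": current[0] * hop_s,
                "end": (current[-1] + 1) * hop_s,
                "pitch_hz": sum(pitches) / len(pitches),
                "amplitude": sum(amps) / len(amps),
            })
            current.clear()

    for i, f0 in enumerate(f0_track):
        if f0 is None:           # unvoiced frame ends the current note
            flush()
            continue
        if current:
            prev = f0_track[current[-1]]
            jump = abs(12 * math.log2(f0 / prev))  # distance in semitones
            if jump > max_jump_semitones:
                flush()
        current.append(i)
    flush()
    return notes


f0 = [None, 220.0, 220.0, 221.0, None, 330.0, 330.0]
amp = [0.0, 0.5, 0.5, 0.5, 0.0, 0.4, 0.4]
for n in segment_notes(f0, amp):
    print(n)
```

A real segmenter also has to cope with vibrato, glides, and legato singing without pauses, which is where most of the difficulty lies.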

The server will then respond with a JSON or XML file (whichever you prefer), which contains the information needed to access the results.
Several text-based result types are available: a simple tab-separated text file, a JSON file and a label file that can be opened in Audacity. We also provide a WAV file that contains a rendering of the detected notes with simple sine tones, so you can listen to what was detected. In addition, we provide a MIDI file with the frequencies rounded to the nearest notes in an A440 reference tuning.
Note that given the free tuning of the human voice, this MIDI file result will only work well if the singer's reference frequency is also 440 Hz (for example, when they are singing along to some background music). We might add some reference tuning correction to the API later on. This remark is only relevant for the MIDI file result; for the other result files, no rounding is performed.
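The rounding issue can be illustrated with the standard equal-temperament conversion, where MIDI note 69 corresponds to A4. The sketch below is our own illustration of that formula, not the service's code, and the function names are hypothetical. It returns the nearest MIDI note together with the deviation in cents, for a configurable reference frequency.

```python
import math


def hz_to_midi(freq_hz, a4_hz=440.0):
    """Fractional MIDI note number for a frequency (equal temperament)."""
    return 69 + 12 * math.log2(freq_hz / a4_hz)


def nearest_midi_note(freq_hz, a4_hz=440.0):
    """Return (nearest MIDI note, deviation from that note in cents)."""
    exact = hz_to_midi(freq_hz, a4_hz)
    note = math.floor(exact + 0.5)
    cents_off = 100 * (exact - note)
    return note, cents_off


# A singer whose "A" sits at 452 Hz instead of 440 Hz:
print(nearest_midi_note(452.0))               # measured against A440
print(nearest_midi_note(452.0, a4_hz=452.0))  # measured against the singer's own reference
```

Against A440, a 452 Hz tone lands almost half a semitone sharp of note 69, so small pitch fluctuations could tip individual notes into the neighbouring semitone. Measured against the singer's own reference, the same tone is exactly on the note, which is the kind of correction the paragraph above refers to.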

Technically speaking, our MeloTranscript service is a RESTful web service, which makes it easy to integrate with web sites, servers and networked desktop/mobile applications.

To register for an API key, browse the documentation, and try the service out, head to http://api.samplesumo.com.

If you are interested in commercial use, please contact us.
If a web API is not what you are looking for, and you would prefer a standalone processing tool or a library for integration in your own projects, we can discuss this too.

Comments

This looks good, but need to test a bit more.

Sure, feel free to tell us what you find: http://www.samplesumo.com/send-us-message
