GSoC 2018 Blog

"GSoC 2018 and Speech Recognition for the Flux Model Zoo: The Conclusion"

Tue 14 August 2018 in updates

#gsoc #updates #asr

Here we are on the other end of Google Summer of Code 2018. It has been a challenging and educational experience, and I wouldn't have it any other way. I am thankful to the Julia community for supporting me through this. I've learned a lot and become even more familiar …

"Those bugs that hide"

Mon 06 August 2018 in updates

#gsoc #updates #ctc #gpu

Here we are at the penultimate blog post. Things finally look like they're wrapping up! There was a key discrepancy I noticed between the code I was working from and the connectionist temporal classification (CTC) algorithm as described by Graves (2012) and how it was implemented in Baidu's warp-ctc library …

"Update: Bug-fixing and interfacing"

Tue 24 July 2018 in updates

#gsoc #updates #ctc #gpu

We're coming into the final stretch of coding for Google Summer of Code 2018 here. The last post I made was not necessarily the most hopeful, but I hope this post will reinvigorate things a bit. I have accomplished quite a bit in these last two weeks, so let's talk …

"Coding period two update: GPU coding and gradients"

Sun 08 July 2018 in updates

#gsoc #updates #ctc #gpu

I'd like to begin this post with a word of advice: never assume that the problem you're working on is too complex to find useful information on using Google. I'm sure you're at least acutely aware of this, but reinforce it for yourself for a moment. It will be relevant …

"Second status update"

Wed 27 June 2018 in updates

#gsoc #updates #ctc

These last two weeks have been spent working on the connectionist temporal classification loss function. Last post, I mentioned that was my bugbear for this project because it is difficult to get coded and working correctly. Coding it up has been an odyssey, to say the least. It's not been …

"First status update"

Mon 11 June 2018 in updates

#gsoc #updates #architecture #ctc

We are upon the first evaluation period, so it's a good time to check in. My time so far in the Google Summer of Code program has been a joy. It's been wonderful to be able to interact with a coding community and spend a significant amount of time working …

"The network architecture"

Sat 09 June 2018 in asr

#asr #convolution #flux #code

UPDATE July 25, 2018: I have changed the network into one chain of functions instead of two that had to be managed with another function. This version should also have the kernels oriented the right way to replicate the paper. Using convolutional layers for speech data There are a number …

"Extracting speech features in Julia"

Mon 28 May 2018 in asr

#asr #speech features #code

Extracting the speech features in Julia God, finally! The code! Up until now, I have been trying to situate automatic speech recognition in the context of what we know about human speech because I believe this is important to be able to reason about the kind of data we're working …

"A high-level introduction to ASR"

Sun 27 May 2018 in asr

#asr #speech #mfcc #mel #features

In the last post, we discussed the acoustic basis of speech. In this post, we'll build on those concepts to discuss automatic speech recognition in preparation for extracting the features we'll use as data in the next post. Automatic speech recognition Automatic speech recognition is the process of having computer …

"The acoustics of speech"

Sat 26 May 2018 in asr

#asr #acoustics #speech

In the last post, we discussed the articulatory aspects of speech. This post will build on that one and discuss the acoustic aspects of speech. The acoustic aspects of speech When speaking, a sound wave is produced. This is the air stream that is produced and modulated during the process …