This is a project I had been planning for a while, and I am glad I finally had some bandwidth available to tackle it.
In this project, I downloaded some of my favorite piano arrangements from MuseScore as .MIDI files and trained several Recurrent Neural Network (RNN) architectures to model the note data. The music composition problem is actually framed as a relatively simple question: given the previous note(s), what is the next note?
By training the network on my favorite piano scores then asking it to predict future notes from a randomized starting note, novel music can be generated. The results can be heard above.
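The next-note framing above can be sketched in a few lines. This is an illustrative Python snippet (the project itself uses Wolfram Language and real MIDI data); the function name and the melody values are hypothetical, just to show how a note sequence becomes supervised training pairs:

```python
def make_training_pairs(notes, window=4):
    """Slice a note sequence into (previous notes, next note) pairs."""
    pairs = []
    for i in range(len(notes) - window):
        pairs.append((notes[i:i + window], notes[i + window]))
    return pairs

# MIDI pitch numbers for an ascending C-major phrase (made-up example data).
melody = [60, 62, 64, 65, 67, 69, 71, 72]
pairs = make_training_pairs(melody, window=4)
print(pairs[0])  # ([60, 62, 64, 65], 67)
```

At generation time, the trained network is seeded with a randomized starting window and its prediction is appended to the sequence, so each new note becomes part of the input for the next prediction.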
I earned my second Featured Contributor Badge for my detailed Wolfram Community post (available here)!
I am super happy with how this project turned out, and I am excited for the future of human/AI collaboration for music composition.
Pluralsight has some fantastic video courses on writing clean, production-ready code, and it is a wonderful resource for quickly learning new tech stacks (like Unreal Engine, web development, FPGA programming, and even CAD and PCB design). I have definitely relied on it to skill up quickly when solving new problems.
For most of the technologies on their platform (mostly programming languages), they have short skill assessments that users can take to see how they compare to their peers.
They even have a button that will automatically brag on Twitter after you take a test:
Of course there is going to be noise in any such test, but it is a decent, quick way to see where you stand in your learning journey for a particular technology.
I have used the YOLO algorithm several times in the past for robotics projects and competitions. This is a popular neural network design for performing object detection (a type of task where your goal is to draw bounding boxes around targets in an image).
The neural network architecture itself is not very difficult to build, but the loss function used for training this network is quite complicated and typically requires access to very low-level features of neural network libraries.
Since I have been learning Mathematica / Wolfram Language (WL) recently, I thought it would be a good exercise to attempt to build my own YOLO implementation using their neural network library.
The WL neural network library is built on top of the Amazon-backed MXNet framework, but the WL interfaces to the framework are extremely high-level. Quite often, I would define functions using standard WL code and let the library compile them down to a neural network architecture.
But this "ease" came at a pretty steep price. There were several functions that WL was unable to compile to a valid neural network architecture, forcing me to build complex NetGraph objects manually. Additionally, I ran into many problems with invalid (NaN) gradient values caused by difficult-to-pinpoint computational sections of the network. To find the culprits, I had to disable parts of the network until the problem stopped, then make modifications that would stabilize the gradient.
One example was computing the Intersection-Over-Union (IOU) for bounding boxes. For some reason, using the computed IOU score inside the loss function caused NaNs in the gradients, but I found that a thresholding operation (x_thresholded = Floor[x*100]/100) stabilized them.
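For clarity, here is what those two pieces compute, sketched in Python rather than the original Wolfram Language (the function names are my own; the thresholding line is a direct translation of the Floor[x*100]/100 expression above):

```python
import math

def iou(box_a, box_b):
    """Intersection-Over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def threshold(x):
    """Quantize to two decimal places, i.e. Floor[x*100]/100 in WL."""
    return math.floor(x * 100) / 100

# Two overlapping unit-area boxes: intersection 1, union 7, so IOU = 1/7.
score = iou((0, 0, 2, 2), (1, 1, 3, 3))
print(threshold(score))  # 0.14
```

The quantization makes the IOU piecewise constant over small input changes, which is presumably why it avoided the pathological gradient values.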
Another example: using a categorical cross-entropy loss for the class scores also caused NaN gradients, so I replaced it with a Smooth L1 loss, which significantly improved gradient stability.
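The standard Smooth L1 (Huber-style) loss that was swapped in is quadratic near zero and linear in the tails, so its gradient is bounded even for large errors; a minimal Python sketch of the per-element definition:

```python
def smooth_l1(pred, target):
    """Smooth L1 loss: 0.5*d^2 for |d| < 1, else |d| - 0.5."""
    d = abs(pred - target)
    if d < 1.0:
        return 0.5 * d * d
    return d - 0.5

print(smooth_l1(0.5, 0.0))  # 0.125  (quadratic region)
print(smooth_l1(3.0, 0.0))  # 2.5    (linear region)
```

Unlike cross-entropy, it never involves a log of a value near zero, which is a common source of NaNs during training.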
Despite running into these bugs, I was able to implement the training algorithm and finish the first version of the WolfDetector Library. In the library, I use a pre-trained YOLOv2 network from the Wolfram Neural Network Repository to reduce the amount of data required to learn to detect new objects. Additionally, I implemented Pascal VOC dataset loading in WL, as this seems to be the most common format for small datasets and open-source data labeling tools.
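Pascal VOC stores one XML annotation file per image, with a bounding box and class name per object. The library's actual loader is written in WL; this is just an illustrative Python equivalent (parse_voc and the sample annotation are hypothetical) showing the structure being read:

```python
import xml.etree.ElementTree as ET

def parse_voc(xml_text):
    """Extract (class name, (xmin, ymin, xmax, ymax)) pairs from a VOC annotation."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bb = obj.find("bndbox")
        box = tuple(int(bb.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name, box))
    return boxes

sample = """<annotation>
  <object><name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""
print(parse_voc(sample))  # [('dog', (48, 240, 195, 371))]
```

Because most open-source labeling tools can export this format, supporting it means the library can train directly on hand-labeled data with no conversion step.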
I also earned my first Featured Contributor Badge for the Wolfram Community post I made when I was done!
This project has been a lot of fun. There is an open-source 3D-printable robot arm design called the Dexter, created by Haddington Dynamics, but the PCB for the robot was using an unnecessarily complicated 12-layer board with hundreds of components.
So my friend Thomas Fagan and I decided we would make our own controller board that was way cheaper to fabricate. A few hours in Autodesk Eagle, and we had our own beautiful 2-layer board that is functionally equivalent to Haddington's monstrosity.
Additionally, to reduce the cost further, we use off-the-shelf, ultra-quiet Trinamic stepper driver boards (commonly used in 3D printers) and standard connectors. In the end, we can build our board for a total BoM cost of around a hundred dollars (compared to the $1.5k+ Haddington charges for the part). Sure, our board is a bit bulky, but it gets the job done!
Welcome to my website! My name is Alec Graves, and I am a robotics software developer with a passion for Machine Learning. I am skilled in many different technologies from C++ to CAD, but my focus is developing highly extensible and reliable software systems. Feel free to stay for a while and read my reflections on personal projects.