Archive for the ‘Videos’ Category

Arduino PID motor speed controller – Extra fun bits (part 2/2)

Tuesday, September 13th, 2016

These are a few of the bits that I cut from the main video because it was too long, including running it at full speed and a comparison with a super-simple system! There’s a strobe light in this video, too.

You might want to watch part 1 if you haven’t already, so that this makes more sense.

[Watch in HD]

Arduino PID motor speed controller – Casual demo (part 1/2)

Monday, September 12th, 2016

I threw this together from an old toy’s motor, an old printer’s IR sensor, a pizza box and some other things, to try out the PID controller algorithm after discovering it on Wikipedia and seeing that there was pseudocode – meaning I didn’t have to get a PhD in mathematics to read the crazy-looking formulas that Wikipedia seems to be so fond of. There’s a strobe light in this video.

I had planned to screen-capture my program while recording but completely forgot to at the time, so please try to survive my camcorder pointing at my laptop screen…

[Watch in HD]

Here, the PID controller is trying to keep the motor at a precise speed (and get it there as quickly as possible). It doesn’t work well half the time because the L298 (H-bridge), responsible for switching power to the motor, doesn’t seem to like making the motor brake. That means it speeds up much more quickly than it slows down, which the algorithm doesn’t like (it’s designed for linear systems) – it basically ends up trying too hard to slow down, resulting in a big undershoot. I might be able to somewhat compensate for that in code.
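
For the curious, the heart of it is just the standard PID loop from Wikipedia’s pseudocode, which in Arduino-style C++ looks roughly like this (readSpeedHz() and setMotorPWM() are hypothetical stand-ins for my sensor and L298 code, and the gains are placeholders, not my tuned values):

    // Hypothetical stand-ins for the IR sensor and L298 driver code.
    float readSpeedHz();                  // measured motor speed in Hz
    void setMotorPWM(float output);       // -255..255; negative = brake/reverse

    float Kp = 2.0, Ki = 0.5, Kd = 0.1;   // placeholder gains
    float setpoint = 50.0;                // target speed in Hz
    float integral = 0, prevError = 0;
    unsigned long prevTime = 0;

    void loop() {
      unsigned long now = millis();
      float dt = (now - prevTime) / 1000.0;  // seconds since last update
      if (dt <= 0) return;                   // wait for the clock to advance
      prevTime = now;

      float error = setpoint - readSpeedHz();
      integral += error * dt;
      float derivative = (error - prevError) / dt;
      prevError = error;

      float output = Kp * error + Ki * integral + Kd * derivative;

      // Because the L298 barely brakes, negative output does far less than
      // positive output. One possible compensation is to scale negative
      // outputs up here so the algorithm sees a more symmetrical system.
      setMotorPWM(constrain(output, -255, 255));
    }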

I might try this with a Sabertooth motor speed controller (as used in my old singing motors project) in place of the L298, which can certainly force a motor to stop spinning, but the Sabertooth gives such a boost to the motor to get it up to speed that 90% of the PID’s job becomes redundant… Oh well, at least it’d be able to hit any given note without me having to calibrate it first like I did with the singing motors. By the way, that’s why this system measures speed in Hz – I originally intended for it to play music like a new kind of “singing motor”.

Originally, I planned to use a 3-pin computer fan instead of this motor, using the tachometer pin to measure the speed, but that required a common ground for the motor and the tachometer, and I didn’t have the right components available (I only had N-channel MOSFETs, but I needed a P-channel MOSFET). So I ended up throwing my own motor assembly together and driving it with just an N-channel MOSFET (which could only switch power on/off, not brake), which the PID system didn’t like. I thought the L298 would fix that problem, since it’d allow the PID system to reverse power to the motor and brake it, but it turns out it’s too weak to have much of an effect after all… =/

Part 2/2 will show it running at full speed (with a more powerful PSU), show a much more naïve speed controller algorithm for the lulz, and just clear up a couple of details.

Neural Network Learns to Generate Voice (RNN/LSTM)

Tuesday, May 24th, 2016

This is what happens when you throw raw audio (which happens to be a cute voice) into a neural network and then tell it to spit out what it’s learned. (WARNING: Although I decreased the volume and there’s visual indication of what sound is to come, please don’t have your volume too high.)

[Watch in HD]

This is a recurrent neural network (LSTM type) with 3 layers of 680 neurons each, trying to find patterns in audio and reproduce them as well as it can. It’s not a particularly big network considering the complexity and size of the data, mostly due to computing constraints, which makes me even more impressed with what it managed to do.

The audio that the network was learning from is voice actress Kanematsu Yuka voicing Hinata from Pure Pure. I used 11025 Hz, 8-bit audio because sound files get big quickly, at least compared to text files – 10 minutes already runs to 6.29MB, while that much plain text would take weeks or months for a human to read.
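
(For the curious: 11025 samples/s × 1 byte/sample × 600 s ≈ 6.6 million bytes, which is about 6.3MB counting 1MB as 1024² bytes – so that file is essentially all raw 8-bit samples.)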

I was using the program “torch-rnn”, which is actually designed to learn from and generate plain text. I wrote a program that converts any data into UTF-8 text and vice versa, and to my excitement, torch-rnn happily processed that text as if there was nothing unusual. I did this because I don’t know where to begin coding my own neural network program, but the workaround has some annoying constraints – e.g. torch-rnn doesn’t like to output more than about 300KB of data, hence all the generated sounds being only ~27 seconds long.

It took roughly 29 hours to train the network to ~35 epochs (74,000 iterations) and over 12 hours to generate the samples (output audio). These times are quite approximate as the same server was both training and sampling (from past network “checkpoints”) at the same time, which slowed it down. Huge thanks go to Melan for letting me use his server for this fun project! Let’s try a bigger network next time, if you can stand waiting an hour for 27 seconds of potentially-useless audio. xD

I feel that my target audience couldn’t possibly get any smaller than it is right now…

EDIT: Because I’ve been asked a lot, the settings I used for training were: rnn_size: 680, num_layers: 3, wordvec_size: 110. Also, here are some graphs showing losses during training (click to see full-size versions):


Training loss (at every iteration) (linear time scale)

Training loss (at every iteration) (logarithmic time scale)

Validation loss (at every checkpoint, i.e. every 1000th iteration) (linear time scale)

Validation loss (at every checkpoint, i.e. every 1000th iteration) (logarithmic time scale)

For sampling, I simply used torch-rnn’s default settings (which is a temperature of 1), specifying only the checkpoint and length and redirecting it to a file. For training an RNN on voice in this way, I think the most important aspect is how “clear” the audio is, i.e. how obvious patterns are against noise, plus the fact that it’s 8-bit so it only has to learn from 256 unique symbols. This relatively sharp-sounding voice is very close to a filtered sawtooth signal, compared to other voices which are more breathy/noisy (the difference is even visible to human eyes just by looking at the waveform), so I think it had an easier time learning this voice than it would some others. There’s also the simple fact that, because the voice is high-pitched, the lengths of the patterns that it needs to learn are shorter.
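
For completeness, the torch-rnn invocations for all of the above would look something like this (the file names are just placeholders for my converted audio and checkpoints):

    th train.lua -input_h5 voice.h5 -input_json voice.json \
       -rnn_size 680 -num_layers 3 -wordvec_size 110

    th sample.lua -checkpoint cv/checkpoint_74000.t7 -length 300000 > sample.txt

The sampled text then goes back through my converter (see EDIT 2 below) to become audio again.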

EDIT 2: I have been asked several times about my binary-to-UTF-8 program. The program basically substitutes a valid UTF-8 character for each raw byte value, so after conversion there’ll be at most 256 unique UTF-8 characters. I threw the program together in VB6, so it will only run on Windows. However, I rewrote all the important code in a C++-like pseudocode here. Also, here is an English explanation of how my binary-to-UTF-8 program works.
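
In case it helps, here’s a minimal C++ sketch of the same idea. I’m assuming the simplest possible mapping here (raw byte value b → Unicode codepoint b, i.e. Latin-1 → UTF-8); my VB6 program’s mapping may differ in detail, but the principle is the same – each byte becomes exactly one UTF-8 character, at most 256 distinct ones, and the conversion is reversible:

    #include <cstdint>
    #include <string>
    #include <vector>

    // Encode each raw byte as the UTF-8 form of codepoint 0..255
    // (i.e. treat the bytes as Latin-1 characters). Bytes < 0x80 stay
    // as-is; bytes >= 0x80 become a two-byte UTF-8 sequence.
    std::string bytesToUtf8(const std::vector<uint8_t>& data) {
        std::string out;
        for (uint8_t b : data) {
            if (b < 0x80) {
                out += static_cast<char>(b);
            } else {
                out += static_cast<char>(0xC0 | (b >> 6));    // leading byte
                out += static_cast<char>(0x80 | (b & 0x3F));  // continuation byte
            }
        }
        return out;
    }

    // Inverse: decode text produced by bytesToUtf8() back into raw bytes.
    // (Assumes well-formed input, i.e. only the sequences generated above.)
    std::vector<uint8_t> utf8ToBytes(const std::string& text) {
        std::vector<uint8_t> out;
        for (size_t i = 0; i < text.size(); ++i) {
            uint8_t c = static_cast<uint8_t>(text[i]);
            if (c < 0x80) {
                out.push_back(c);
            } else {
                uint8_t cont = static_cast<uint8_t>(text[++i]);
                out.push_back(static_cast<uint8_t>((c << 6) | (cont & 0x3F)));
            }
        }
        return out;
    }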

EDIT 3: I have released my BinToUTF8 Windows program! Please see this post.

How to reduce the life of your print head by a week [Dot matrix]

Saturday, May 21st, 2016

I was playing around with Windows 98 drivers and found a combination of settings that printed the slowest, loudest, darkest black line I’ve ever seen this dot matrix printer print. And then a second one on top of the first one, just in case it wasn’t dark enough already.

I’d guess that that’s about a week’s worth of wear in 30 seconds.

[Watch in HD]

And now, I’m enjoying random fainter black parts in my prints because that part of the ribbon’s worn much more than the rest, lol.

(Printer is an Epson LQ-300+II)

Random Old Rubbish (Part 2)

Monday, February 29th, 2016

The second and much-shorter part, as I clear out some random rubbish in my room. There are a few more old electronic devices, including a ~25-year-old LCD game, plus some paper stuff…

[Watch in HD]

This time, I didn’t throw away or dismantle everything in the video! I did thoroughly rearrange whatever remained afterwards, though.

Server that actually looks (and sounds) like a server

Sunday, February 28th, 2016

I was recently with a friend who runs a server which is a lot more impressive than mine, so I thought I’d show it off. It also sounds like a jet engine.

[Watch in HD]

For a start, it’s actually in a rack-mount case (2U), with ~17TB total disk capacity and 20GB of RAM (it usually has about 24, but he had to remove some to use in another machine, hence the sticks of RAM lying on top of the case). It’s running a few VMs for people (with Arch Linux as the host), acting as a NAS, and doing a few other things like running some IRC bots, but he shut it down and rebooted it so that I could hear the fans rev up. =D

Random Old Rubbish (Part 1)

Saturday, February 27th, 2016

I was cleaning out my room and found a load of random stuff (mostly toys), and some of it was interesting, so I decided to record it. Some of it’s around 14 years old. I didn’t intend for 50% of the video to be about Beyblades.

[Watch in HD]

Stress-test Audio CD

Monday, February 15th, 2016

Never mind just gapless playback – let’s throw 99 tracks at various CD players (hardware and software) within 40 seconds and see how they handle it!

The music I used is “Nature’s Gasp” by Atmozfears & Devin Wild. Big thanks to Atmozfears for letting me use it on YouTube (now let’s hope that YouTube’s automatic song recognition doesn’t punish me despite that…).

[Watch in HD]

This test just naturally emerged after I played around with splitting tracks in a hardstyle mix so that they would play back seamlessly from a CD. The trick to ensuring no silence between tracks was to split on CDDA frame boundaries (every 2352 bytes, which makes 75 frames per second for audio CDs). It took me some time before I realised that Audacity can measure position in CDDA frames and that I didn’t have to convert sample counts into CDDA frames myself every time…
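
In case anyone wants to do the same: 2352 bytes per frame ÷ 4 bytes per stereo 16-bit sample = 588 samples per frame (i.e. 44100 ÷ 75). Here’s a quick sketch of snapping a split point to a frame boundary – my own illustration, not code from any CD tool:

    #include <cstdio>

    // One CDDA frame = 2352 bytes = 588 stereo 16-bit samples
    // (44100 samples/s / 75 frames/s).
    const long SAMPLES_PER_FRAME = 2352 / 4;  // = 588

    // Snap a sample position to the nearest frame boundary so that a
    // track split introduces no gap.
    long snapToFrame(long samplePos) {
        long frame = (samplePos + SAMPLES_PER_FRAME / 2) / SAMPLES_PER_FRAME;
        return frame * SAMPLES_PER_FRAME;
    }

    int main() {
        long pos = 1764123;  // an arbitrary split point, in samples
        printf("%ld -> %ld\n", pos, snapToFrame(pos));  // prints 1764123 -> 1764000
    }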

I don’t have a grudge against Foobar or anything – it really did get stuck in a loop of spinning up and down the CD the last time I tried it. Also, this may be my most anticlimactic and rushed ending ever.

Multi-strike, multi-pass, colour correction (WIP: Dot matrix program)

Wednesday, February 10th, 2016

Showing off a couple more things not possible with the Epson driver: multi-strike printing and “quiet” (not really) mode, along with CMYK colour correction which is nearly invisible to my camcorder, so that part was a waste of video…

This is a program I’m casually working on every now and then to print images on any 24-pin ESC/P2 dot matrix printer (ESC/P2 is Epson’s control language for their dot matrix printers). It directly controls the printer by sending raw commands to it; you just need to tell Windows that it’s a “Generic / Text Only” printer instead of using the official Epson driver, and Windows will pass the commands straight on to the printer without trying to translate them.

[Watch in HD]

This is a standalone program for printing image files, not a driver for printing from any program. I’ve not yet released it, but I intend to some time. Compared to the driver, it currently allows:

  • Printing in (lower) resolutions for high speed (down to 60 DPI).
  • Detailed control over colour dithering/thresholding.
  • Very tall print-outs not restricted to a paper length (e.g. for continuous paper).
  • Printing only individual component colour(s) of an image.
  • * Faster colour printing by doing large blocks of each colour at once.
  • * Multi-strike printing (optionally offsetting each one to fill in the gaps between the earlier ones’ dots).
  • * “Quiet” (multi-pass) printing (unfortunately, I can’t control the actual speed).

*The last three are somewhat “hacks”, abusing commands to try to force unofficial behaviour, and as such, they rarely work properly in combination with each other. In particular, the last two often don’t work when printing colour.

By the way, printing in blocks of colour no longer relies on sending commands with the correct timing (as it did in the previous video), which means it’s now much more reliable and doesn’t get messed up by pausing the printer, image content, etc.
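
If you’re wondering what “sending raw commands” actually looks like, here’s a minimal sketch (not my actual program): you build the ESC/P2 byte sequences yourself, write them to a file, and send that file to the printer untranslated – e.g. with “copy /b image.prn LPT1”, or by printing it through the “Generic / Text Only” driver. The m = 39 below should select 24-pin 180×180 DPI graphics, but check the ESC/P2 reference for the modes your printer supports:

    #include <cstdio>

    // Write raw ESC/P2 bytes to a file, to be sent to the printer unmodified.
    int main() {
        FILE* f = fopen("image.prn", "wb");

        fputs("\x1B@", f);  // ESC @ - initialise the printer

        // ESC * m nL nH <data> - select bit-image mode and print graphics.
        // m = 39 should be 24-pin 180x180 DPI; columns = nL + 256*nH.
        // In 24-pin modes, each column is 3 bytes (top pin = MSB of byte 1).
        const int columns = 90;  // half an inch at 180 DPI
        unsigned char cmd[] = { 0x1B, '*', 39,
                                (unsigned char)(columns & 0xFF),
                                (unsigned char)(columns >> 8) };
        fwrite(cmd, 1, sizeof(cmd), f);
        for (int i = 0; i < columns; ++i) {
            unsigned char col[3] = { 0xFF, 0xFF, 0xFF };  // fire all 24 pins
            fwrite(col, 1, 3, f);
        }

        fputs("\r\n", f);  // carriage return + line feed
        fclose(f);
    }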

Solar Panel Microphone (Experiment)

Sunday, January 31st, 2016

Here, I connected a solar panel (via a transformer) to a sound interface as if it were a microphone, to reveal the subtle pulsing and flickering of various light sources. If you don’t like 50 Hz, this video isn’t for you.

[Watch in HD]

Thankfully, the infrared light from my camcorder is apparently very clean (not pulsing), so I can use that to see things in the dark without affecting the sound.

The transformer is just designed to step 230V mains down to 12V, so its audio properties are not very good (it muffles things a lot). Ideally, I’d be using an audio transformer that’s designed to sound good, but this is all I had available. I’m using it to block the DC that the solar panel produces, because I don’t fancy putting 17.5V into my Quad-Capture (sound interface)’s mic input. I originally tried to make a high-pass filter to remove the DC, using a capacitor and resistor, but it only worked until the capacitor became fully charged, at which point the sound faded. It was much clearer-sounding than the transformer, but there was also a huge amount of background noise.
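
(For reference, a series-capacitor high-pass like that has a cutoff of fc = 1 / (2πRC). For example, a 10 µF capacitor feeding a 10 kΩ load gives fc ≈ 1.6 Hz – easily low enough to pass 50 Hz flicker while blocking DC. Those values are just illustrative, not necessarily what I used.)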

I want to revisit this idea in the future, especially to take it for a drive at night, listening to the street lights and car lights (since modern cars use PWM to dim the tail lights).