I resurrected a bunch of old projects that I made either for fun or while job hunting:
This is a demo made to breathe life into a mockup screenshot I found online. I don’t remember where I got the mockup from, but I like the vibe :)
This is a simple shapes application written in TypeScript/Angular. It lets you create and delete shapes, and select and transform existing ones. It saves everything by default, so refreshing the page should restore your scene.
This is a demo that tries to imitate the Super Mario Bros. character controls as closely as possible. It was written using TypeScript/PhaserJS in Cloud9 (back when it was usable). I didn’t intend for it to be a full game.
Through October, my brother and I worked to remaster an old game of mine, Zamron Encounter. We had two goals:
- Porting the game graphics to hardware-accelerated Stage3D in Flash. The game used Flash/AS3 software rendering for graphics, which is awfully slow and buggy.
- My brother wanted to remake the game art for fun, and so far he has done a great job!
Technically the remaster is 95% done: there are two bugs left to resolve (albeit tricky ones), and then we can re-release a much better version of Zamron.
Artistically, my brother has finished the level art and a bunch of other stuff. He is still working on the monster and player textures.
There is a larger goal to eventually start a new project that remakes Zamron from the ground up as a full product using a modern engine. The remastering effort serves two purposes: having a usable prototype to base the remake on, and developing a work pipeline between the two of us.
I’ll talk more about Zamron’s remaster progress in the future.
Here’s a sneak peek of what we currently have:
Currently working on: Game Off 2019
Game Off 2019 is a relaxed 30-day game jam that started a few days ago, and my brother and I are participating. The theme for this one is Leaps and Bounds. So far we have a few concepts in progress, and I hope we’ll be able to show some progress by next week.
A while ago I used a simple benchmark to very roughly compare performance on multiple platforms: PC-6002 vs 80s Computers Benchmark and got some interesting results.
Recently I got a Raspberry Pi 4 and wanted to figure out how its new CPU compares to other platforms, so I went back to that simple benchmark, scaled it up by 1000x, and ran it in many different ways on many different devices and platforms. I think the results are noteworthy :) but it’s still just for fun; this is by no means a benchmark that should be taken seriously.
The simple bench I used looks like this:
    import math
    import time
    import os

    all_primes = []
    t1 = time.time()
    skip = False
    for i in range(2, 100000):
        skip = False
        # Only divisors up to sqrt(i) need to be tested
        k = math.floor(math.sqrt(float(i))) + 1.0
        for j in range(2, int(k)):
            # i is divisible by j when the float quotient has no fraction
            k1 = i / float(j)
            k2 = int(k1)
            if k1 == k2:
                skip = True
                break
        if skip:
            continue
        all_primes.append(i)
    elapsed = (time.time() - t1)
    print("Prime count = " + str(len(all_primes)))
    print("Python Time=" + str(elapsed))
Here are the systems I ran simplebench on:
- Raspberry Pi 4: Cortex A72 1.5GHz, Raspbian
- Raspberry Pi 3: Cortex A53 1.2GHz, Raspbian
- Raspberry Pi Zero W: ARMv6 1GHz, Raspbian
- Desktop PC: AMD Ryzen 5 1600 3.2GHz, Windows 10
- Desktop PC: AMD Ryzen 7 3700X 3.6GHz, Windows 10
- Mini PC: Intel Pentium 4415U 2.3GHz, Kubuntu 18.04
- Laptop: Intel Core i5-4258U 2.4GHz, Kubuntu 18.04
- Laptop: Intel Core i7-8550U 1.8GHz, Kubuntu 18.04
- Laptop: Intel Pentium 4415Y 1.6GHz, Windows 10
- Laptop: AMD E-450 1.6GHz, Debian
- Mobile: Snapdragon 855 2.84GHz+1.78GHz, Android 9
- ShieldTV: nVIDIA Tegra X1 2GHz, Android 8
Here are the runtimes and tools I used to run the benchmark:
- GCC 7+ (alternatively clang) with -std=c++14 and -O3 flags
- Lua 5.1+ and luajit
- NodeJS 8
- Godot 3.1
- Python 2.7
- termux for Android devices
I wanted to put minimal time into this, so I didn’t try to run everything on every platform, just what was easily doable.
For each measured time, I ran the simplebench script/binary more than 10 times and took the shortest achieved time:
|RaspberryPi4/Cortex A72 1.6GHz||0.61s||4.3s||0.14s||0.061s||0.045s||0.136s||0.087s|
|RaspberryPi3/Cortex A53 1.4GHz||1.191s||10.22s||0.34s||0.157s||0.092s||0.49s||0.163s|
|Laptop/Core i5 4258U 2.4GHz||0.22s||1.2s||0.041s||0.018s||0.015s||0.045s||0.025s||0.637s|
|Laptop/AMD E-450 1.65GHz||1.17s||7.2s||0.210s||0.110s||0.079s||0.233s||0.152s|
|Mobile/Snapdragon 855 2.84GHz||0.36s||1.68s||0.013s||0.020s||0.758s|
|Laptop/Core i7 8550U 1.8GHz||0.18s||1.0s||0.037s||0.012s||0.011s||0.053s||0.018s||0.004s|
|Laptop/Pentium 4415Y 1.6GHz||0.46s||3.65s||0.094s||0.020s||0.030s||1.411s|
|Desktop/AMD Ryzen 5 1600 3.2GHz||0.22s||1.38s||0.031s||0.008s||0.011s||0.011s||0.725s|
|Desktop/AMD Ryzen 7 3700X 3.6GHz||0.156s||1.26s||0.026s||0.007s||0.008s||0.014s||0.496s|
|ShieldTV/Cortex A57 2.01GHz||0.745s||4.73s||0.049s||0.012s||0.056s||1.496s||0.012s|
|Laptop/Atom x5-z8350 1.44GHz||0.998s||8.78s||0.433s||0.095s|
|Mini/Intel Pentium 4415U 2.3GHz||0.269s||1.52s||0.043s||0.014s||0.014s||0.063s||0.016s||0.007s|
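The "run it many times and keep the shortest time" approach can be sketched in Python like this. It is only an illustration, not the script I actually used: the numbers above came from re-running the whole script/binary, while this version repeats the work inside one process. The names `count_primes` and `best_of` are made up for the example, it uses Python 3's `time.perf_counter`, and 10 runs is just a stand-in for "more than 10".

```python
import time


def count_primes(limit):
    """Trial-division prime count, the same kind of work simplebench does."""
    count = 0
    for i in range(2, limit):
        is_prime = True
        j = 2
        while j * j <= i:          # test divisors up to sqrt(i)
            if i % j == 0:
                is_prime = False
                break
            j += 1
        if is_prime:
            count += 1
    return count


def best_of(func, args=(), runs=10):
    """Call func(*args) `runs` times; return (shortest time, last result)."""
    best = float("inf")
    result = None
    for _ in range(runs):
        start = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best, result


if __name__ == "__main__":
    best, primes = best_of(count_primes, (100000,), runs=10)
    print("Prime count = %d" % primes)
    print("Best of 10 = %.3fs" % best)
```

Taking the minimum rather than the average filters out runs that were slowed down by whatever else the machine was doing at the time.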
Mobile phone processors are catching up to laptop processors very quickly. I’ve read that the Snapdragon 855 is similar in performance to a current-gen Core i3, and these results confirm that. It’s especially impressive that the Snapdragon 855 almost exactly matches a desktop Ryzen 5 1600 in Godot/GDScript and comes very close in Lua and Python.
CPython is incredibly slow compared to everything else, which comes as no surprise. Some of my old CPython/PyGame games struggled to hit 60 fps on the laptop processors of the time without some sort of just-in-time compilation thrown in (back then I used Psyco).
The PyPy runtime (a descendant of Psyco with full JIT compilation) is impressively quick in comparison to CPython; it almost matches Mono, actually. It’s a surprise PyPy hasn’t become the dominant Python runtime yet! I think it ought to be.
Lua and LuaJIT
As expected, Lua is very fast for a fully interpreted language. I wish CPython were closer to that.
LuaJIT is damn impressive. It’s actually really close to native C++ performance, which is insane!
The anomaly here is NodeJS, which uses Google’s excellently optimized V8 engine. Not only did it match native C++ performance, it actually surpassed it on several platforms. I have no explanation other than possibly blaming it on timer precision, but I’m not surprised its performance is that good: it runs not only a huge part of the web but also a growing list of desktop/mobile applications, like the very editor I’m using to write these words.
It’s worth noting that the startup time when running node simplebench.js is almost as slow as compiling the C++ version. This indicates some hardcore JIT compilation taking place before actual execution of the script starts.
Godot’s GDScript is not as bad as I thought it would be, about 2x CPython. On its own that would make Godot a terrible solution for bigger games, but luckily Godot allows C++ modules to be used for critical bits, and Mono/C# support is almost ready for prime time. I’m hoping that at some point they decide to reimplement GDScript to compile to Mono.
With the weather not being so nice this weekend, and a quick 2-week vacation coming up somewhere where it’s summertime, I thought I’d do a fun indoor thing on Saturday.
A while ago I set up TIC-80 on the NVIDIA Shield TV and hooked up a Bluetooth keyboard and mouse. The idea is to do some light gamedev directly on the TV while chilling on the couch. It’ll be like the most relaxed form of development possible :)
I thought I’d try something very simple, so I went with a minimal Tetris written as straightforwardly as possible. I thoroughly enjoyed it, and how easy and fast developing it in Lua was. It only took me a few hours to finish all the features I wanted. I think this might be the fastest I’ve ever finished a game!
When I bought my Amiga A500 from TradeMe, it came with a compatible monitor which had some old-monitor picture quality problems. I expected that, as CRT monitors do tend to deteriorate in quality over the years and also tend to die suddenly.
I was not proven wrong as the monitor did die a few weeks later with a click and a whine.
The Amiga A500 has a video-out port but puzzlingly it outputs in greyscale only.
I searched for how to connect the Amiga to a modern monitor using either RGB or HDMI and found that the only indirect way to do it is with an ugly Commodore Amiga adapter called the A520, which provides an RCA video-out signal in color as well as RF out.
Someone was selling a pair of these on eBay so I got them. They both worked, but one of them seemed to need some maintenance, as it required a bit of fiddling when plugged in before it would output in color. Sometimes it insists on only outputting blurry greyscale.
I ran the output to an LCD TV using RCA video and audio out. Both adapters produced an awful, blurry picture that made it very difficult to read any text. Since then the Amiga has been all but unusable.
Modifying A520 to output S-Video
I had to decide what to do with the Amiga: either sell it, or find a way to get good quality output and make it usable again. While doing a quick search I found this great step-by-step guide for converting the A520 to output an S-Video signal, which should make it usable again.
I decided to try it. I ordered all the electronic components needed 2 weeks ago, and when they arrived I set up a work area and spent a day going through all the steps. After many hours and 3 soldering-iron burns I got it done. I just needed to test it.
S-Video to RCA Video Out
S-Video output means I get two signals out of the modified A520: one called Chroma (letter C) and one called Luma (letter Y for some reason). The guide wasn’t clear on how to convert that to a single RCA video-out signal. There are commercial S-Video to RCA converters, but I felt that since I had come this far I might as well try adding the conversion to the circuit.
Upon googling, I was surprised to find that this conversion requires a single component and is extremely easy to do! Just a single 470 pF capacitor across the Chroma and Luma outputs.
Did a quick breadboard test and voila! It got video out to display on TV:
The difference was very clear even through my phone’s camera:
After adding that capacitor to the output, then rewiring and reseating the board, we have a modified A520 ready for use:
I left it running for several hours to make sure everything was working as it should.
SPRITESR is a work-in-progress sprite library to automate drawing and animating sprites. SPRITESR is written to work specifically with N66 SR Mode 6 Screen 2 (Warka PC-6002, NEC PC6001 Mk2 SR, NEC PC-6601 SR, and NEC Mr PC).
To run the demo in the emulator, insert floppy disk image, go to mode 6 then type:
SPRITESR provides 8 sprite slots that auto-redraw on VSYNC at half rate (30 fps).
Sprite coordinates can be updated dynamically, and the actual bitmap data can be changed to allow multi-frame animations.
SPRITESR works best for 8x8 pixel sprites but there is no hard limit. It can draw sprites of any size under two conditions: height must be divisible by 2, and width must be divisible by 4.
However, flicker may occur for sprites larger than 8x8.
SPRITESR lets you specify which sprites were changed this frame by calling the procedure SPRITESR_SETREDRAW. It will only fully redraw changed sprites and skip clearing unchanged sprites (useful for reducing flicker slightly).
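To illustrate the redraw-flag idea, here is a rough Python model of 8 slots serviced at half the VSYNC rate. It is not a translation of the assembly: the class and method names are hypothetical, and it assumes that the flags are reset after each drawn frame and that drawing happens on every second VSYNC. It only tracks which slots would get the full clear-and-redraw.

```python
NUM_SLOTS = 8  # SPRITESR provides 8 sprite slots


class SpriteSlots:
    """Toy model of per-slot redraw flags serviced at half the VSYNC rate."""

    def __init__(self):
        self.dirty = [False] * NUM_SLOTS
        self.vsync_count = 0

    def set_redraw(self, index):
        """Models SPRITESR_SETREDRAW: mark one slot as changed this frame."""
        if not 0 <= index < NUM_SLOTS:
            raise ValueError("sprite index must be 0 to 7")
        self.dirty[index] = True

    def redraw_all(self):
        """Models SPRITESR_REDRAWALL: mark every slot as changed."""
        self.dirty = [True] * NUM_SLOTS

    def on_vsync(self):
        """Called on every VSYNC; returns the slots fully redrawn this tick."""
        self.vsync_count += 1
        if self.vsync_count % 2 != 0:
            return []                      # skipped tick (half rate)
        drawn = [i for i, flag in enumerate(self.dirty) if flag]
        self.dirty = [False] * NUM_SLOTS   # assumption: flags last one frame
        return drawn


slots = SpriteSlots()
slots.set_redraw(3)
print(slots.on_vsync())  # []
print(slots.on_vsync())  # [3]
```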
The library is currently made up of the following modules:
- n66sr_bios.asm: definitions for a number of N66 SR BIOS calls (addresses) and useful system constants.
- vsync.asm: vsync user event setup utility.
- fastclear.asm: fast screen 2 clearing procedures covering all 3 areas of VRAM.
- spritesr.asm: fast sprite drawing in the directly accessible VRAM rectangle of screen 2, which starts at 0, 12 and is 256x188 pixels.
PNG60 was updated with new functionality to support exporting p6bmp binaries directly usable by SPRITESR.
The demo is made up of the following modules:
- picard.asm: actual demo code setting up all sprites, reading input, and updating sprites accordingly.
- sprites.asm: definitions for a number of test sprites with different sizes.
SPRITESR is currently written entirely in Z80 assembly and doesn’t make use of the COMPILE60 tool yet.
I went through multiple attempts to implement fast sprites, and for the life of me I still can’t figure out what’s wrong with VRAM in SR screen 2. For some reason, writing directly to the rectangular areas 0,0 to 255,11 or 256,0 to 320,200 just produces garbage, despite the memory map clearly covering those areas in all the information I found.
The VRAM layout is also odd. Each 2 horizontal pixels are directly equal to 1 byte but these pairs of pixels are arranged in 8 pixel blocks (4 bytes of memory) representing 4x2 pixels on screen:
Sprites are drawn block by block rather than pixel by pixel, where every 4x2 pixels make up 1 block. This means sprite width must be divisible by 4 and sprite height by 2.
This produces very fast drawing that approaches hardware sprite performance, although things get costly quickly as sprites get larger. As a consequence of drawing blocks of pixels, animation must be done in steps of 2 pixels for both horizontal and vertical movement.
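The block arithmetic is easy to check with a few lines of Python. This is only a sketch of the math described above (2 pixels per byte, 4x2 pixel blocks of 4 bytes each); the function names are made up, and it says nothing about the order of bytes inside a block.

```python
BLOCK_W, BLOCK_H = 4, 2    # one block covers 4x2 pixels
BYTES_PER_BLOCK = 4        # 8 pixels at 2 pixels per byte


def sprite_blocks(width, height):
    """Return (block columns, block rows, total bytes) for a sprite."""
    if width <= 0 or height <= 0:
        raise ValueError("sprite size must be positive")
    if width % BLOCK_W != 0 or height % BLOCK_H != 0:
        raise ValueError("width must be divisible by 4 and height by 2")
    cols = width // BLOCK_W
    rows = height // BLOCK_H
    return cols, rows, cols * rows * BYTES_PER_BLOCK


def snap_to_step(coord):
    """Round a coordinate down to the 2-pixel movement step."""
    return coord & ~1


print(sprite_blocks(8, 8))    # (2, 4, 32)
print(sprite_blocks(16, 16))  # (4, 8, 128)
print(snap_to_step(13))       # 12
```

A 16x16 sprite moves four times as many bytes as an 8x8 one, which is why larger sprites get costly so quickly.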
For the garbage areas at the top and right there are two ways to write pixels correctly:
- Output Y coordinate to 2 special ports: 0xce (low) and 0xcf (high) and read/write pixels from/to X coordinate 0 to 320 mapped directly to address 0x0000.
- View screen 1 (text screen) and set screen 2 as work area. When writing data directly to screen 2 vram in this case it works flawlessly but you can’t see screen 2 in the meantime so this isn’t useful.
The first method always works perfectly but is noticeably slower, as 2 bytes need to be output for every horizontal line.
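To give a feel for the overhead of the port method, here is a small Python sketch that lists the port writes needed to select each line. The port numbers are the ones mentioned above; the helper names are hypothetical, and writing the low byte before the high byte is an assumption on my part.

```python
PORT_Y_LOW = 0xCE    # low byte of the Y coordinate
PORT_Y_HIGH = 0xCF   # high byte of the Y coordinate


def y_port_writes(y):
    """Return the (port, value) pairs needed to select line y."""
    if not 0 <= y <= 0xFFFF:
        raise ValueError("y must fit in 16 bits")
    return [(PORT_Y_LOW, y & 0xFF), (PORT_Y_HIGH, (y >> 8) & 0xFF)]


def writes_for_rows(first_y, height):
    """List all port writes needed to touch `height` consecutive lines."""
    writes = []
    for y in range(first_y, first_y + height):
        writes.extend(y_port_writes(y))
    return writes


print(y_port_writes(5))             # [(206, 5), (207, 0)]
print(len(writes_for_rows(0, 8)))   # 16
```

So even an 8 pixel tall sprite costs 16 extra port outputs on top of the actual pixel writes.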
The missing piece in the puzzle was how exactly to eliminate or at least minimize flicker?
In N66 mode you get multiple pages, with the ability to instantly set which page is viewed and which page is worked on, and video memory updates accordingly.
This allowed a straightforward double-buffered drawing technique, with no need to interrupt on VSYNC and time the drawing correctly.
In N66 SR you only get a single graphics page, so double buffering is no longer possible.
While searching for information about the platform online, I ran across TinyYarou’s blog, where he writes about his projects for the P6 platform. In one particular blog post he demonstrates a project he did in N66 SR BASIC and briefly mentions VSYNC interrupt usage.
I could not find any usable information about that interrupt’s address, or how to set it up and use it. I sent him a message through Twitter and he replied with a perfect assembly code snippet showing exactly what I was looking for!
I immediately put together a demo and it worked! No more flicker, or at least a lot less compared to just directly redrawing sprites.
This became the basis for SPRITESR’s current sprite drawing and animating method. The vsync.asm module is mostly based on that code snippet.
- Due to how VRAM in SR screen 2 works, sprites are limited to the rectangular area starting at 0, 12 with width/height = 256x188.
- Sprites are cleared using background color specified, meaning the background must be a solid color.
- Sprite overlap is problematic as background clearing will overwrite sprite pixels below.
- No transparency is supported.
- Sprite movement horizontally and vertically can only be 2 pixels at a time due to the VRAM layout.
- Support simple dithered background clear colors.
- Attempt to implement some sort of transparency with minimal performance impact.
- Implement simple overlap detection between user-marked sprites.
- Simple tile map support: each tile is 16x16 pixels and may be used for clearing sprite backgrounds if the performance impact is acceptable.
- Tile map scrolling support using the ROLL hardware scroller available in SR, in a single direction only (horizontal or vertical).
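For the planned overlap detection, the usual rectangle test would probably be enough. Here is a Python sketch of that test under my own assumptions: the function name is hypothetical, each sprite is described by its top-left corner and size, and none of this exists in SPRITESR yet.

```python
def sprites_overlap(ax, ay, aw, ah, bx, by, bw, bh):
    """True when two axis-aligned sprite rectangles share at least one pixel."""
    return (ax < bx + bw and bx < ax + aw and
            ay < by + bh and by < ay + ah)


# Two 8x8 sprites inside the 0,12 sprite rectangle
print(sprites_overlap(0, 12, 8, 8, 4, 16, 8, 8))  # True
print(sprites_overlap(0, 12, 8, 8, 8, 12, 8, 8))  # False (only edges touch)
```

Since it is just four comparisons, restricting it to user-marked sprites should keep the per-frame cost small.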
VSYNC Module (vsync.asm)
- SETVSYNCEVENT: start vsync event
- ENDVSYNCEVENT: end vsync event
VSYNC event automatically calls event: VSYNCEVENT (implemented in SPRITESR)
FASTCLEAR Module (fastclear.asm)
- CLEARSCREEN2: slow clears all of screen 2.
  - Set clear color: register C (0x0 to 0xf)
- FASTCLEARSCREEN2: fast clears the sprite rectangle in screen 2 @ 0, 12 (256x188).
  - Set clear color: register C (two 16-color pixels packed in one byte, 0x00 to 0xff)
- CLEARSCREEN2TOP: slow clears the top area of screen 2.
  - Set clear color: register C (0x0 to 0xf)
- CLEARSCREEN2RIGHT: slow clears the right area of screen 2.
  - Set clear color: register C (0x0 to 0xf)
- FASTCLEAR: fast fills memory using a given value.
  - Start address: HL
  - Size high byte: D
  - Size low byte: B
  - Clear value: C
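The difference between the two color formats (one 4-bit color for the slow clears versus a byte holding two pixels for the fast ones) can be shown with a short Python sketch. The helper names are made up for the example. For a solid color both pixels in the byte are the same, so the order of the two pixels within the byte doesn't matter here.

```python
RECT_W, RECT_H = 256, 188   # sprite rectangle in screen 2
PIXELS_PER_BYTE = 2


def pack_clear_color(color):
    """Build the byte for a solid fast clear from one 4-bit color."""
    if not 0x0 <= color <= 0xF:
        raise ValueError("color must be 0x0 to 0xf")
    return (color << 4) | color


def rect_bytes(width, height):
    """Number of VRAM bytes covered by a width x height pixel area."""
    return (width * height) // PIXELS_PER_BYTE


print(hex(pack_clear_color(0x3)))   # 0x33
print(rect_bytes(RECT_W, RECT_H))   # 24064
```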
SPRITESR Module (spritesr.asm)
- Sprite data init and update: SPRITESR_DATA0 to SPRITESR_DATA7
- SPRITESR_SETREDRAW: mark a sprite as updated so it is redrawn this frame.
  - Use sprite index: register A (0 to 7)
- SPRITESR_REDRAWALL: mark all sprites as updated (full redraw) this frame.
- FASTDRAWSPRITE: draw a sprite exported by PNG60 for mode 6 (-6 -f flags).
  - Sprite data address: HL
  - X coordinate: C
  - Y coordinate: B
- FASTCLEARSPRITE: clear the sprite area using a given color.
  - Sprite data address: HL
  - X coordinate: C
  - Y coordinate: B
  - Clear color (two pixel colors 0x00 to 0xff): A
Compile SPRITESR as a usable z88dk library. Implement the same bouncy sprites demo in C.