Breaking CAPTCHA in Python

CAPTCHAs are usually analyzed with neural networks. That is a good approach, but it can be overcomplicated for simple cases. The much shorter algorithm presented below can produce sufficient results for uncomplicated CAPTCHAs.

The algorithm is simple: an image with an unknown letter is compared with samples of known letters. The letter in the most similar sample is probably also the letter in the analyzed image. It was implemented as a Python script; its usage is presented below:

Sample usage of the CAPTCHA breaker in Python:
bash-3.2$ python cracker.py test1.png 
e
Another sample usage of the script for breaking CAPTCHAs in Python:
bash-3.2$ python cracker.py test2.png 
p

Please note that this script can't be used directly on a raw CAPTCHA. First, small artifacts should be removed from the CAPTCHA; second, each letter should be stored in a separate image.
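The comparison step can be sketched as follows. This is a minimal illustration, not the cracker.py script itself: it assumes each letter has already been cleaned, cropped to a common size and converted to a binary grid (1 = ink, 0 = background), and it uses nested lists instead of an image library so that it runs without dependencies. The names `pixel_difference`, `recognize`, `SAMPLES` and `NOISY_L` are made up for this example, and "similarity" is assumed to mean the number of differing pixels.

```python
def parse(art):
    """Turn rows of '#' and '.' characters into a grid of 1s and 0s."""
    return [[1 if ch == "#" else 0 for ch in row] for row in art]


def pixel_difference(image_a, image_b):
    """Count the pixels that differ between two equally sized binary images."""
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("images must have the same dimensions")
    return sum(
        pa != pb
        for row_a, row_b in zip(image_a, image_b)
        for pa, pb in zip(row_a, row_b)
    )


def recognize(unknown, samples):
    """Return the letter whose sample differs from `unknown` in the fewest pixels."""
    return min(samples, key=lambda letter: pixel_difference(unknown, samples[letter]))


# Hypothetical 3x5 samples of known letters.
SAMPLES = {
    "i": parse([".#.", "...", ".#.", ".#.", ".#."]),
    "l": parse(["#..", "#..", "#..", "#..", "##."]),
    "o": parse(["...", "###", "#.#", "#.#", "###"]),
}

# An "l" with one stray pixel, standing in for a letter cut out of a CAPTCHA.
NOISY_L = parse(["#..", "#..", "#.#", "#..", "##."])

print(recognize(NOISY_L, SAMPLES))
```

With real images, the grids would come from thresholded pixel data, and more samples per letter should make the match more robust.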

Edge Detection by using Genetic Algorithms

Abstract

There are plenty of algorithms for finding edges in images. In this post I will present a variant that uses a genetic approach.

Edges found by using a genetic algorithm (100 generations):

output image produced by genetic algorithm applied to search edges in the image

Original image:

input image used in edge detection based on genetic algorithm

Virtual Machine implemented as a Domain Specific Language in Ruby

Abstract

Some time ago I created a simple programming language that was compiled to its own assembly language. A file in this assembly language could be executed in a virtual machine or compiled to an executable. Today I will present an idea for using a Domain Specific Language (DSL) created in Ruby to load and execute the assembly file just like a regular Ruby script.

Implementation

A sample assembly file is presented below. It's intuitive except for the "int" mnemonic, which means "interrupt": int 0 pops an element from the stack and prints it on STDOUT. There aren't any registers; everything is done using a stack.

bar:
    int 0
    ret

foo:
    call bar
    ret

main:
    push 13
    call foo
    ret
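To make the semantics of the sample concrete, here is a minimal interpreter sketch in Python. It is not the Ruby DSL described in this post; it only illustrates how such a stack machine could execute the file. It assumes that return addresses live on a separate call stack (otherwise int 0 would pop a return address instead of 13), that execution starts at the main label, and that a ret with an empty call stack ends the program. The function names `parse` and `run` are made up for this example.

```python
SAMPLE = """
bar:
    int 0
    ret

foo:
    call bar
    ret

main:
    push 13
    call foo
    ret
"""


def parse(source):
    """Split assembly text into a flat instruction list and a label -> index map."""
    program, labels = [], {}
    for raw_line in source.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(program)  # label points at the next instruction
        else:
            program.append(line.split())
    return program, labels


def run(source, entry="main"):
    """Execute the program and return the list of values printed by `int 0`."""
    program, labels = parse(source)
    data_stack, call_stack, output = [], [], []
    pc = labels[entry]
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "push":
            data_stack.append(int(args[0]))
        elif op == "call":
            call_stack.append(pc)  # remember where to continue after ret
            pc = labels[args[0]]
        elif op == "ret":
            if not call_stack:  # returning from the entry point ends the program
                return output
            pc = call_stack.pop()
        elif op == "int" and args == ["0"]:
            value = data_stack.pop()
            print(value)
            output.append(value)
        else:
            raise ValueError(f"unknown instruction: {op} {' '.join(args)}")


run(SAMPLE)  # prints 13
```

The call chain main -> foo -> bar reaches int 0 with 13 on the data stack, so the program prints 13 and then unwinds through the three ret instructions.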

Motion estimation in super-resolution algorithms

Abstract

One of the important steps in super-resolution algorithms is the estimation of initial constants; without it, the algorithms will produce useless higher-resolution images.

From a mathematical point of view, we may say that when a photo is taken, the original scene is captured in high resolution, with some movement between each image (e.g. a shaking hand or a satellite moving along its orbit). Next, the point spread function (PSF) is applied and noise is added, and at the end the image is downscaled.

motion estimation in python, used for super resolution algorithms

To obtain a higher resolution we need to somehow reverse this process, and to do that we need to estimate the movements mentioned above. In this post I will present two simple algorithms that I used to estimate the movement between images.
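The two algorithms themselves are not shown in this excerpt. As a rough illustration of what movement estimation means here, the sketch below uses one common simple technique, which may or may not match the post's algorithms: an exhaustive search over integer shifts that picks the shift with the smallest sum of squared differences. It assumes pure translation, grayscale images stored as nested lists, and circular wrap-around at the borders (a simplification). All function names are made up for this example.

```python
def shift(image, dy, dx):
    """Move the image content down by dy and right by dx, wrapping at the borders."""
    height, width = len(image), len(image[0])
    return [
        [image[(y - dy) % height][(x - dx) % width] for x in range(width)]
        for y in range(height)
    ]


def sum_squared_diff(image_a, image_b):
    """Sum of squared pixel differences between two equally sized images."""
    return sum(
        (pa - pb) ** 2
        for row_a, row_b in zip(image_a, image_b)
        for pa, pb in zip(row_a, row_b)
    )


def estimate_shift(reference, moved, max_shift=2):
    """Try every integer shift up to max_shift and return the best (dy, dx)."""
    candidates = [
        (dy, dx)
        for dy in range(-max_shift, max_shift + 1)
        for dx in range(-max_shift, max_shift + 1)
    ]
    return min(
        candidates,
        key=lambda s: sum_squared_diff(shift(reference, s[0], s[1]), moved),
    )


# A 6x6 test image with a small asymmetric bright spot.
reference = [[0] * 6 for _ in range(6)]
reference[1][1], reference[1][2], reference[2][1] = 9, 5, 3

moved = shift(reference, 1, -2)  # simulate movement between two frames
print(estimate_shift(reference, moved))  # (1, -2)
```

Real super-resolution input also needs sub-pixel accuracy and tolerance to noise, which this integer search does not provide.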

How to prevent this program from printing anything?

Put something in place of the blanks to prevent the program below from printing anything.

#include <stdio.h>

int main() {
    if (     ) {
        printf("foo");
    }
    else {
        printf("bar");
    }

    return 0;
}

German tank problem

Sometimes science is applied in really amazing ways; one example is the German tank problem. During World War II, the Allies observed that the Nazis had created a new model of tank. The total number of manufactured units was unknown, but the Allies knew the serial numbers of some tanks. In this case, the first tank produced has serial number 1, the second has serial number 2, etc. How could the Allies estimate the total number of manufactured tanks?

The answer can be estimated using a simple equation. Let's assume that N = estimated total number of enemy tanks, k = number of observed tanks, m = maximum observed serial number. Then:

N = m - 1 + m/k

Example

We observed enemy tanks with serial numbers: 20, 23, 45, 47, 48. In this case, m=48, k=5. The answer can be computed on a calculator or, for example, in the R language (I really like to use it as a powerful "calculator"):

> 48 - 1 + 48/5
[1] 56.6
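The same estimate can be wrapped in a small function. The sketch below is in Python; the name `estimate_total` is made up for this example. It takes k from the length of the list, so the count always matches the observed serial numbers. The list above has five entries, so k = 5 here.

```python
def estimate_total(serials):
    """Estimate the total number of tanks as N = m - 1 + m/k."""
    if not serials:
        raise ValueError("at least one serial number is required")
    m = max(serials)  # maximum observed serial number
    k = len(serials)  # number of observed tanks
    return m - 1 + m / k


observed = [20, 23, 45, 47, 48]
print(round(estimate_total(observed), 1))  # 56.6
```

Seeing only a single tank with serial number 10 would give 10 - 1 + 10/1 = 19, which shows how much the estimate depends on the sample size.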

Other use cases

The same approach as for the German tank problem was used to estimate the number of Apple products sold. Customers were asked to upload their serial numbers.

Zapper built from NAND gates

Zapper, what's that?

A zapper is a simple square-wave signal generator that is connected to the human body. It's believed that it can kill parasites, and that parasites (and pollutants) are the source of many or even all illnesses.

Results

I think that this can work. Maybe it's my imagination or autosuggestion, but I feel better, I have more energy and I'm more optimistic.