I recently decided to release a personal project of mine on GitHub. The name is CrowLeer and you can find it here.
Over the last year I worked for a customer who needed software capable of extracting specific data from the pages of a number of public websites. I was ready to write the code for recognizing and storing that data, but couldn't find any existing crawler that fit my needs. They come in all shapes:
I ended up using one of the previously mentioned "unreliable" ones (with loads of ad-hoc middleware) and called it a day, but months later I decided to create my own as a personal project.
CrowLeer was created with simplicity, control and interfaceability in mind. You can find all the details on the GitHub page linked at the top of the article. I plan to greatly expand its features, but I already find it much more functional than many of the competitors I've worked with.