CrowLeer, the fast and reliable CLI web crawler with a focus on downloading pages

16 Dec 2017 / ERap320

I recently decided to release a personal project of mine on GitHub. The name is CrowLeer and you can find it here.

Over the last year I worked for a customer who needed software capable of extracting particular data from the pages of a number of public websites. I was ready to write the code for recognizing and storing that data, but couldn't find an existing crawler that fit my needs. They come in all shapes:

  • Some offer a lot of very useful SEO data but can't download pages
  • Others have a download feature but lack the granular control needed to avoid downloading or following a great number of irrelevant pages
  • The ones which can download and have proper control over the flow of the crawling lack reliability or a proper way to be integrated with other software

I ended up using one of the previously mentioned "unreliable" ones (with loads of ad-hoc middleware) and called it a day, but months later decided to create my own as a personal project.

CrowLeer was created with simplicity, control and interfaceability in mind. You can find all the details on the GitHub page linked at the top of the article. I have plans to greatly expand its features, but I already find it much more functional than many of the competitors I've worked with.

How to run graphical applications with the Windows Subsystem for Linux

17 Jul 2017 / ERap320

First of all, you have to activate the Windows Subsystem for Linux from Turn Windows features on or off, which you can find with a simple search from the Start menu.

Download, install and start Xming, our substitute for the X server usually found on Linux. This component will render the windows of the GUI programs we throw at it.

At this point you have to install a program with a graphical interface. For this article I'll use a text editor named gedit, but you can use pretty much anything that comes to mind. I was even able to successfully run xfce4 straight from the default repository.

sudo apt-get install gedit

After the installation is completed, we "link" the shell to Xming by executing

export DISPLAY=:0

If you want to avoid having to write the same command every time you restart bash, you can just append it to .bashrc, found in your home directory.
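As a minimal sketch of that step, the function below appends the export line only when it is missing, so running it more than once doesn't leave duplicates. The name add_display_export is just an illustration, and the function defaults to .bashrc in your home directory:

```shell
# Append the DISPLAY export to a bash startup file, but only once.
# add_display_export is an illustrative name, not part of any existing tool.
add_display_export() {
    rc_file="${1:-$HOME/.bashrc}"
    display_line='export DISPLAY=:0'

    # Create the file if it does not exist yet.
    touch "$rc_file"

    # Nothing to do if the exact line is already present.
    # -q: quiet, -x: match the whole line, -F: fixed string instead of a regex
    if grep -qxF "$display_line" "$rc_file"; then
        return 0
    fi

    # If the file does not end with a newline, add one first so the
    # export does not get glued to the previous line.
    if [ -n "$(tail -c 1 "$rc_file")" ]; then
        printf '\n' >> "$rc_file"
    fi

    printf '%s\n' "$display_line" >> "$rc_file"
}
```

Call add_display_export with no arguments to target your own .bashrc, then open a new bash session (or source the file) for the change to take effect.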

Now you just have to run the program by typing its name, like this:

gedit

Making shortcuts for Linux programs

You can make shortcuts to start GUI Linux programs straight from your desktop, using the "bash -c" command.

The command of the shortcut is:

bash -c "export DISPLAY=:0; [[[PROGRAM NAME]]]"

Since bash -c doesn't read the user's .bashrc, the command has to set the display itself before the program is started. Of course, you have to make sure Xming is running every time you open one of these shortcuts.
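The same idea can be wrapped in a small helper if you start GUI programs from your own scripts inside WSL. This is only a sketch: run_gui is a made-up name, and it assumes the X server listens on display :0 as in the rest of this article. It sets DISPLAY for the launched program only, and leaves an already defined DISPLAY untouched:

```shell
# run_gui: start a program with DISPLAY set, without relying on .bashrc.
# run_gui is a hypothetical helper name; :0 is the display assumed in this article.
run_gui() {
    if [ "$#" -eq 0 ]; then
        echo "usage: run_gui PROGRAM [ARGS...]" >&2
        return 2
    fi

    # Keep an existing DISPLAY value, otherwise fall back to :0.
    # The assignment applies only to the program being started.
    DISPLAY="${DISPLAY:-:0}" "$@"
}
```

With the function defined, run_gui gedit behaves like the shortcut command above.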

How to use the old Photo Viewer in Windows 10

06 Jul 2017 / ERap320

Download this .reg file and run it.

The next time you try to Open with... an image file, Photo Viewer will show up as a choice.

You can also set which files it will manage from Control Panel > Programs > Default Programs.

How to recover Grub after a Windows installation

05 Jul 2017 / ERap320

In an administrative Windows cmd prompt:

bcdedit /set {bootmgr} path \EFI\ubuntu\grubx64.efi

If the Linux distribution isn't Ubuntu you can just swap the name of the directory.

If you don't know the exact name of the directory, find a way of mounting the EFI partition (it's just FAT32) and check what's in there.

The beginning

05 Jul 2017 / ERap320

There are many useful things out there, but too much stuff to remember. Maybe they could also be useful to you, so I'll drop them here.


