© 2015-2017 Alvaro Carballo Garcia

NO NEW PROJECTS:
Project 10 is expected to be the last formal project of varocarbas.com. I will continue using this site as my main self-promotional, R&D-focused online resource, but through other, more suitable formats such as the domain ranking.
Note that the latest versions of all the successfully completed projects (5 to 10) will always be available.
Domain ranking

Objective, from-scratch, backlink-based web domain ranking built on the "everything is connected" idea (i.e., all the listed domains are somehow connected to the starting one). It includes only a few restrictions, which are strictly required to output generally valid conclusions. Examples of restrictions: ignoring links between same-name domains, or penalising groups of similar domains that get most of their backlinks from other members of the same group (a quite common scenario which has a particularly negative impact on the accuracy of a high-quality, small-dataset approach like the current one). Note also that the cancelled Project 1 can be considered the precursor of this ranking; to learn more about Project 1, visit the corresponding customsolvers.com page.
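As an illustration of the first restriction mentioned above, here is a minimal Python sketch of dropping links between same-name domains. It is not the code used by the ranking: the helper names (domain_name, filter_backlinks) are hypothetical, and "same name" is assumed here to mean "same label right before the last dot", which is a naive rule that does not handle multi-part suffixes such as co.uk.

```python
def domain_name(domain: str) -> str:
    """Naive 'name' of a domain: the label right before the last dot (assumption)."""
    labels = domain.lower().strip(".").split(".")
    return labels[-2] if len(labels) >= 2 else labels[0]


def filter_backlinks(links):
    """Keep only the (source, target) pairs whose domains have different names."""
    return [(s, t) for s, t in links if domain_name(s) != domain_name(t)]


links = [
    ("example.com", "example.org"),       # same name, different TLD: ignored
    ("blog.example.com", "example.com"),  # subdomain of the same name: ignored
    ("example.com", "other.net"),         # different names: kept
]
kept = filter_backlinks(links)
```

The group-penalisation restriction would need additional information (what makes domains "similar") which is not described here, so it is left out of the sketch.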

This ranking relies only on the information retrieved by a set of crawling bots which I have personally developed and which navigate the internet by applying the aforementioned approach (i.e., each domain is reached through a link from the previous one). At the moment, only the ranking-type-2 bots are actively collecting information. I might perform some manual corrections to the bot-generated outputs, but only to improve the overall system reliability.

The outputs of this ranking are improved with each iteration (zero-based), which is defined as follows:
  • Building a big enough preliminary ranking (stage-1). In iteration 0, the bots performed a simpler and completely unrestricted analysis.
  • Building the main ranking (stage-2) by weighting the backlinks on account of the position of the given domain in the stage-1 ranking.
  • Once the stage-2 ranking gets big enough, it becomes the stage-1 basis for the next iteration.
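
The three steps above can be sketched in Python as follows. This is only a rough illustration under stated assumptions, not the actual algorithm: the function names, the 1/(1 + position) weight, the small default weight for domains missing from stage-1, and the reading of "the given domain" as the linking (source) domain are all hypothetical choices made for the example.

```python
from collections import defaultdict


def rank(scores):
    """Domains sorted by descending score; ties are broken alphabetically."""
    return sorted(scores, key=lambda d: (-scores[d], d))


def stage1(backlinks):
    """Preliminary ranking: every backlink counts the same (as in iteration 0)."""
    scores = defaultdict(float)
    for _source, target in backlinks:
        scores[target] += 1.0
    return rank(scores)


def stage2(backlinks, stage1_ranking, default_weight=0.01):
    """Main ranking: a backlink weighs more when its source ranks high in stage-1."""
    position = {domain: i for i, domain in enumerate(stage1_ranking)}
    scores = defaultdict(float)
    for source, target in backlinks:
        # Hypothetical weighting; sources unknown to stage-1 get a small default.
        if source in position:
            weight = 1.0 / (1 + position[source])
        else:
            weight = default_weight
        scores[target] += weight
    return rank(scores)


backlinks = [
    ("a.com", "b.com"), ("c.com", "b.com"), ("d.com", "b.com"),
    ("b.com", "a.com"),
    ("d.com", "c.com"), ("e.com", "c.com"),
]
preliminary = stage1(backlinks)          # stage-1 of the current iteration
main = stage2(backlinks, preliminary)    # stage-2 of the current iteration
next_preliminary = main                  # becomes stage-1 of the next iteration
```

In this toy dataset, a.com has a single backlink, but it comes from the top stage-1 domain, so it moves ahead of domains with more (but lower-weighted) backlinks in stage-2.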
The reliability of this ranking is highly conditioned by the iteration number (stage-1 information quality) and the duration of the analysis (number of domains/backlinks under consideration). Note that the transition between iterations is a delicate process which can affect reliability; the online updates might be paused for a few days after starting a new iteration. The top positions are always more reliable than the bottom ones, where the recently found domains are located. The 1-10 dependability score below gives a good idea of the quality of the information at each point.

This is a software-focused system which intends to use minimal hardware resources; at the moment, it only uses the following:
  • Crawling bots, main storing and synchronising applications: local desktop computer with 4 cores (2.4 GHz) and 3.8 GiB of memory.
  • Online search functionality: varocarbas.com resources, as defined by the MDDHosting basic plan.
  • Backups: over 5 hard drives in different locations.
I will update this page to reflect any relevant change in the conditions of this system, the online search functionality, my expectations or similar. I will also post regularly in the log of the associated researchgate.net project.

IMPORTANT NOTE: the sole purpose of this ranking is to promote my software development skills and related issues (e.g., attitude at work). I have created the whole system (i.e., ranking algorithms, crawling bots, storage/backup/sync subsystems, etc.) completely from scratch and am the only person dealing with it (+ optimising/debugging/extending its functionalities). The results of this ranking are automatically generated by a set of applications built to deliver the objectively best outputs. I might perform corrections to ensure the reliability of the conclusions, never to intentionally benefit/damage anyone.


Domain ranking search
This functionality allows searching through the latest domain ranking version. It is URL-friendly and supports the following input scenarios:
  • When inputting an individual full/partial domain name or URL, the highest-ranked match is returned. Exact matches are always preferred.
    Examples: twitter.com or facebook.
  • When writing "top" or "first" ("bottom" or "last") [number of records up to 250], the corresponding top (bottom) domains are returned.
    Examples: top 50 or last 3.
  • To get a certain number of records starting/ending at a given position, write that position, a blank space and the range size (up to 250).
    Examples: 1 50 or 20 -5.
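
The input scenarios above could be told apart with a small parser like the following Python sketch. It is an illustration only, not the code behind this search: the function name, the returned tuples and the clamping of larger values to 250 are assumptions (the text above only states that up to 250 records are supported).

```python
import re

MAX_RECORDS = 250  # upper limit stated in the input scenarios


def parse_query(query: str):
    """Classify a search input as a top/bottom, range or domain query."""
    text = query.strip().lower()

    # "top"/"first" or "bottom"/"last" followed by a number of records.
    match = re.fullmatch(r"(top|first|bottom|last)\s+(\d+)", text)
    if match:
        side = "top" if match.group(1) in ("top", "first") else "bottom"
        return (side, min(int(match.group(2)), MAX_RECORDS))

    # Position, blank space and range size (negative: range ending there).
    match = re.fullmatch(r"(\d+)\s+(-?\d+)", text)
    if match:
        position, size = int(match.group(1)), int(match.group(2))
        size = max(-MAX_RECORDS, min(size, MAX_RECORDS))  # assumed clamping
        return ("range", position, size)

    # Anything else is treated as a full/partial domain name or URL.
    return ("domain", text)


examples = ["twitter.com", "facebook", "top 50", "last 3", "1 50", "20 -5"]
parsed = [parse_query(q) for q in examples]
```

Returning a tagged tuple keeps the parsing separate from the actual lookup, which would then pick the highest-ranked (preferably exact) match for domain queries.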
Updated every 6 hours
2344143 domains (stage-1: 1000000)
Iteration 2
Dependability 3/10