[Stallman:] Developing such software would be a big job, but possible if people are dedicated. It would probably take some years.
It might be easier if we start from the GitLab software. That is free, right?
However, I doubt we could even possibly hope to pull most free software hosting away from GitHub.
Let's suppose we do a great job of developing that software and we set up a server running it, and we want to compete with GitHub for projects to choose us. How many free projects are there on GitHub? Hundreds of thousands, I suppose.
To provide good service for that many projects, I think we would need a server farm, and hundreds of staff. We could not afford that.
We would need those staff, and rental for the server farm, not for a one-time development expense, but as operating costs, year after year.
The only way we could do that is by charging for the service. Most projects would choose some other service which is gratis.
However, those projects that chose our service would get good service, since we could afford to give it to them, for pay.
We could make this work, but would it make a big difference?
Hi Richard,
I feel encouraged that most of your concern about a GitHub replacement is technical and economic. Those problems can be solved. The key is to use a distributed architecture.
I see five important reasons to go with a distributed git repository:
1. Distributed I/O and CPU load.
2. No single point of failure (for example, under a DDoS attack).
3. No single entity would have to finance and maintain a gargantuan datacenter.
4. No one country could censor the content of the repository.
5. No single entity could completely control the entire repository.
I have done some basic research and come up with a proposed technology: for the back end, the project can use a PostgreSQL database server with the PostgreSQL ltree extension. ltree is a very powerful and performant data type for tree-like data structures such as git's, and it would be perfect for this application.
Putting the git data schema entirely in a database provides a secure and robust system, with transactional integrity.
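As a rough illustration of the ltree idea, here is a minimal Python sketch that maps a repository file path to an ltree label path and shows the kind of table it might live in. Everything named here is an assumption made for illustration: the tree_entry table, its columns, the escaping scheme, and the helper names are hypothetical, not part of any existing design. The escaping assumes the stricter label rule (letters, digits, and underscores only) used by PostgreSQL releases before 16.

```python
# Hypothetical schema: one row per file in a repository tree.
# The ltree type, the GiST index and the <@ operator come from the
# PostgreSQL ltree extension; the table and column names are made up.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS ltree;
CREATE TABLE tree_entry (
    path      ltree PRIMARY KEY,
    object_id text  NOT NULL
);
CREATE INDEX tree_entry_path_gist ON tree_entry USING GIST (path);
"""

# Every entry at or below a given directory: "<@" means
# "left path is a descendant of (or equal to) the right path".
SUBTREE_SQL = "SELECT path, object_id FROM tree_entry WHERE path <@ %s::ltree;"


def to_label(component):
    """Escape one path component into an ltree-safe label.

    ASCII letters and digits pass through unchanged. Every other
    character, including '_' itself, becomes '_' plus two hex digits
    per UTF-8 byte, so distinct components give distinct labels.
    """
    out = []
    for ch in component:
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            out.extend("_%02x" % b for b in ch.encode("utf-8"))
    return "".join(out)


def to_ltree(repo, file_path):
    """Build a dot-separated ltree path from a repo name and a file path."""
    parts = [repo] + [p for p in file_path.split("/") if p]
    return ".".join(to_label(p) for p in parts)


print(to_ltree("emacs", "src/main.c"))  # emacs.src.main_2ec
```

A subtree lookup would then pass something like to_ltree("emacs", "src") as the parameter of SUBTREE_SQL. Note that git history is a graph of commits rather than a single tree, so this sketch only covers the directory-tree part of the data.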
Perhaps most importantly, PostgreSQL 10 has introduced a feature called "Logical Replication", through which one can replicate individual database objects (tables) between databases on different hosts. This can provide an efficient and solid transactional mechanism for distributed replication.
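To make the replication idea a little more concrete, here is a minimal Python sketch that generates the SQL for one publishing site and its mirrors. CREATE PUBLICATION and CREATE SUBSCRIPTION are the real PostgreSQL commands; the publication, subscription, table, and host names are hypothetical. The sketch covers only the simple one-way case (one site publishes, the others subscribe). Several sites writing to the same tables would need conflict handling that is not addressed here.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote an SQL string literal (assumes standard_conforming_strings=on)."""
    return "'" + value.replace("'", "''") + "'"


def publication_sql(pub_name, tables):
    """SQL to run on the publishing site."""
    table_list = ", ".join(quote_ident(t) for t in tables)
    return f"CREATE PUBLICATION {quote_ident(pub_name)} FOR TABLE {table_list};"


def subscription_sql(sub_name, conninfo, pub_name):
    """SQL to run on each mirror; conninfo points at the publishing site."""
    return (
        f"CREATE SUBSCRIPTION {quote_ident(sub_name)} "
        f"CONNECTION {quote_literal(conninfo)} "
        f"PUBLICATION {quote_ident(pub_name)};"
    )


# Hypothetical names, for illustration only.
print(publication_sql("git_pub", ["tree_entry"]))
print(subscription_sql("git_sub_site_b",
                       "host=site-a.example.org dbname=githost",
                       "git_pub"))
```

The publishing server must also run with wal_level set to logical, and the subscribed tables must already exist on each mirror, since logical replication copies row changes but not table definitions.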
So, the core idea is to have several sites, located and independently financed in a number of countries.
Now, would such a thing make a big difference? Well, like most software projects it would start out small, and then get bigger. Code from Savannah can begin to be migrated in, making it immediately important, and then the project will certainly receive a lot of attention. I think volunteers will be eager to get on board. As other hubs are established and diverse Free Software projects worldwide join in, there will be a compounding effect. I think ultimately such a system will provide the preferred repository for Free Software, since that domain will be its focus, and it will have the benefits of the distributed implementation outlined above.
It will be an easy sell, assuming the interactive user experience is competitive; people will understand the importance immediately, since Free Software folks do not want to be overseen by Microsoft.
I can come up with a more detailed functional description and system specification if you would like.
Thanks,
Tom G.