VCS based on DB vs Non-DB


EdDev

Hi

 

Well, to start with you have at least all your sources, but you also have even more files in the .git directory, where git does its bookkeeping.

 

Adding a 140 MB website with 278 files in 23 folders to git led to a '.git' directory of 117 MB with 295 files in 181 directories. After the first commit the count is up to 325 files in 194 directories and 117 MB. The numbers will probably grow with each commit (but I'm not sure about that), as will the size.
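If you want to collect numbers like the ones above for your own repository, a small script along these lines should do. It is only a sketch: the function name `directory_stats` is made up for illustration, and it reports the logical file sizes, not the space actually allocated on disk.

```python
import os

def directory_stats(root):
    """Return (file_count, dir_count, total_bytes) for everything under root."""
    file_count = 0
    dir_count = 0
    total_bytes = 0
    for current, dirnames, filenames in os.walk(root):
        # Count the subdirectories found at this level (root itself is not counted).
        dir_count += len(dirnames)
        for name in filenames:
            path = os.path.join(current, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            file_count += 1
            total_bytes += os.path.getsize(path)
    return file_count, dir_count, total_bytes
```

Calling `directory_stats(".git")` from the top of a working copy before and after a commit would show how the counts change.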

 

In PlasticScm it would be 1 file per database, plus 1 for bookkeeping across all repositories (that is with MSSqlCE; other databases create a different number of files). That is at least easier to back up and, with modern SQL servers, more fault tolerant than a file-system-based system.

 

Others like CVS and SVN are not much better, and good old SourceSafe also produces a large number of files: my projects, which fit into around 80 files in PlasticScm (one per project), would take up 17,820+ files in SourceSafe. Each file, no matter how small, takes up at least one cluster of disk space, so lots of small files can take up huge amounts of space.
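To put a rough number on the cluster overhead, here is a back-of-the-envelope sketch. The 4096-byte cluster size and the 500-byte average file size are assumptions picked for illustration, not measurements, and the model follows the simple rule above that every file takes at least one cluster (some file systems store very small files differently).

```python
def on_disk_size(file_sizes, cluster_size=4096):
    """Bytes allocated when every file is rounded up to whole clusters."""
    total = 0
    for size in file_sizes:
        # Ceiling division, with a minimum of one cluster per file.
        clusters = max(1, -(-size // cluster_size))
        total += clusters * cluster_size
    return total

# Hypothetical case: 17,820 files of 500 bytes each.
sizes = [500] * 17820
data_bytes = sum(sizes)
disk_bytes = on_disk_size(sizes)
print(data_bytes, disk_bytes)
```

Under those assumptions, about 8.9 MB of actual data would occupy roughly 73 MB on disk.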

 

regards


  • 1 month later...

I would think the major advantage of the DB is indexing (Performance!). 

 

Let's take a search example. If you want to find the changesets for a particular version, branch or label, each file has to be scanned to see which ones contain the version, branch or label you are looking for. This can be slow with a file-based repository. With a database, those changes could be tagged with a key describing the change, and that key is then stored (and indexed) in the tables where the changes are committed. That indexing can speed up the process of checking out particular versions.
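A small sketch of the idea, using SQLite from Python. The `changes` table, its columns and the index name are hypothetical and have nothing to do with how PlasticScm actually lays out its database; the point is only that the same query goes from a full scan to an index lookup once the branch column is indexed.

```python
import sqlite3

def build_demo_db(with_index):
    """Create an in-memory table of 1000 made-up changes, optionally indexed by branch."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE changes (id INTEGER PRIMARY KEY, path TEXT, branch TEXT)"
    )
    # Every tenth change goes to 'release-1.0', the rest to 'main'.
    rows = [
        (f"src/file{i}.c", "main" if i % 10 else "release-1.0")
        for i in range(1000)
    ]
    conn.executemany("INSERT INTO changes (path, branch) VALUES (?, ?)", rows)
    if with_index:
        conn.execute("CREATE INDEX idx_changes_branch ON changes (branch)")
    return conn

def query_plan(conn, branch):
    """Return SQLite's description of how it would run the branch lookup."""
    cur = conn.execute(
        "EXPLAIN QUERY PLAN SELECT path FROM changes WHERE branch = ?", (branch,)
    )
    return [row[-1] for row in cur.fetchall()]

def paths_on_branch(conn, branch):
    """Return the paths changed on the given branch, in insertion order."""
    cur = conn.execute(
        "SELECT path FROM changes WHERE branch = ? ORDER BY id", (branch,)
    )
    return [row[0] for row in cur]

print(query_plan(build_demo_db(False), "release-1.0"))  # plan without the index
print(query_plan(build_demo_db(True), "release-1.0"))   # plan with the index
```

Both databases return the same rows; only the plan differs. The exact wording of the plan text depends on the SQLite version, so treat the printed output as informational.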

 

Let's face it: databases were created to make data handling very fast. If that were not so, they would never have been created, and all "databases" would be file system based. Lots of files to search = lots of time for the disk heads to seek and read. With databases, those structures are pretty static (defined), so you can skip huge chunks of data rather than seek/scan to find where to start.


Archived

This topic is now archived and is closed to further replies.
