Data usage optimisation.


Marc S

Hi.

When I started using Plastic a few weeks ago, I was pleased to see that my usage on the cloud server was less than the size of the assets on my hard drive, but this has since shifted the other way. I understand Plastic also stores previous versions of my files: if I often update and save a 1GB PSD file, for example, it's fair that it uses more than 1GB on the cloud. That said, I know I don't need to keep previous versions of files like lightmap data, and I was wondering if there are some optimizations I could use so I don't end up using 3x my local data on the cloud.

Is that possible?

Thanks


Hi Marc,

 

Thanks for the suggestion.

 

What you're saying is:

  • Your working copy size is smaller than the repo size. Well, this is normal: as you pointed out, the repo contains ALL revisions, while you only have a working copy. The repo stores revisions compressed, which is why it initially weighed less.
  • You'd like to have a way to "discard history and keep only the last N revisions" for certain files, correct?

Hi psantosl, and thanks for the quick reply.

Yes, that is correct.

A coworker and I are currently working on a large game level. The scene file is 130MB and its lightmap is 1.7GB. We re-bake the lightmap once or twice a week, and we keep it on the cloud server only so it's shared between our computers. We don't need versioning on it, since it can always be re-baked. The scenes are sent back and forth between us all the time, so they can be uploaded 5+ times a day, which is more than we need for long-term backups. As you can see, this can pile up quickly for minor changes.

The lightmap is easy to deal with. Unity always creates a file called "LightingData.asset", so all we would need is a rule for these files that keeps only N revisions, as you suggested.
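
(If we were willing to share the lightmap through some other channel, I know we could simply keep it out of version control: Plastic reads ignore patterns from an ignore.conf file at the workspace root, so a minimal sketch would be just:

  LightingData.asset

But sharing it through Plastic is the whole point for us, so I'd rather trim its history than ignore it.)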

For files like my scene, above a certain size, it's a bit trickier. A solution would be to have another rule that discards chosen assets after X days, but because we might need to go back to a previous version after that date, we could set an exception with a milestone. A milestone would be a "super-changeset" where that rule doesn't apply.

In practice, here is how it would go. We're working as usual and set an occasional milestone each week when we're satisfied with our work. At this point we can always go back to any changeset, because all recent data is kept (as long as it's not a lightmap). Six months later, a client asks us to revert some changes on that scene we were done with: most of the scene's data was erased after X days to save space, but we have the milestones, which is enough for what we need.

 

I understand my request might be unusual. I'm working on a big Unity project to make multiple mini-games that share the same assets, so I'm not creating a new repository each time I start a new one. That means the data I'm currently piling up could follow me for years to come and would cost me a lot in the long run. So while it's not urgent, I hope we can find a solution to avoid hoarding too much data.

Thanks


7 hours ago, Marc S said:

A coworker and I are currently working on a large game level. The scene file is 130MB and its lightmap is 1.7GB. We re-bake the lightmap once or twice a week, and we keep it on the cloud server only so it's shared between our computers. We don't need versioning on it, since it can always be re-baked. The scenes are sent back and forth between us all the time, so they can be uploaded 5+ times a day, which is more than we need for long-term backups. As you can see, this can pile up quickly for minor changes.

My question is: why do you check in the lightmap if it can be re-baked? Because it is easier to share it through Plastic SCM than to use some other mechanism, correct?

 

7 hours ago, Marc S said:

The lightmap is easy to deal with. Unity always creates a file called "LightingData.asset", so all we would need is a rule for these files that keeps only N revisions, as you suggested.

Understood.

 

7 hours ago, Marc S said:

For files like my scene, above a certain size, it's a bit trickier. A solution would be to have another rule that discards chosen assets after X days, but because we might need to go back to a previous version after that date, we could set an exception with a milestone. A milestone would be a "super-changeset" where that rule doesn't apply.

Of course, this means you are on a single branch, right? How would this work with multiple branches?

The "super changeset" can be simply a labelled one. You know, you can label changesets.

So the rule would be something like:

Keep only 3 versions of LightingData.asset.

Keep only N versions of files above a given size, except if they are labelled.
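
By the way, you can already create labels from the command line as well as from the GUI. Something along these lines; the exact syntax varies between versions, so check "cm label --help" first:

  cm label create lb:milestone-week12 cs:1337

where cs:1337 is the changeset you want to mark as a milestone.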

 

7 hours ago, Marc S said:

I understand my request might be unusual. I'm working on a big Unity project to make multiple mini-games that share the same assets, so I'm not creating a new repository each time I start a new one. That means the data I'm currently piling up could follow me for years to come and would cost me a lot in the long run. So while it's not urgent, I hope we can find a solution to avoid hoarding too much data.

I'm actually very interested in this topic, because it makes a lot of sense for us to add extra features for teams using Unity.

 

I'll add more people to the conversation.


I think it would be great to be able to optimize for storage, but I think it's also important that the history of a project remains stable. If you remove the data from a changeset, it becomes impossible to switch to that changeset and view the project as it was at that point in time. Not all changesets are that useful, though, as evidenced by uploading several alterations of a Photoshop document in a single day, so as long as it's done in an orderly fashion, I'm happy. :)

 

On 3/20/2019 at 9:50 AM, psantosl said:
On 3/20/2019 at 2:23 AM, Marc S said:

For files like my scene, above a certain size, it's a bit trickier. A solution would be to have another rule that discards chosen assets after X days, but because we might need to go back to a previous version after that date, we could set an exception with a milestone. A milestone would be a "super-changeset" where that rule doesn't apply.

Of course, this means you are on a single branch, right? How would this work with multiple branches?

The "super changeset" can be simply a labelled one. You know, you can label changesets.

So the rule would be something like:

If you look in the Branch Explorer and view "only relevant" changesets, all the changesets that disappear are, I'd say, eligible for having their data removed. So a general rule could be that all the in-between changesets past some point have their data deleted.


  • 2 weeks later...
On 3/20/2019 at 9:50 AM, psantosl said:

My question is: why do you check in the lightmap if it can be re-baked? Because it is easier to share it through Plastic SCM than to use some other mechanism, correct?

Yes. I come from Unity's "Collaborate", a pseudo-VCS tool used for teamwork. If I'm not mistaken, it only kept the last 10 versions of each file, and that was enough for us.

Understood.

 

Of course, this means you are on a single branch, right? How would this work with multiple branches?

I'm not sure about that. I don't have extensive experience with branches.

The "super changeset" can be simply a labelled one. You know, you can label changesets.

I didn't know that, but I'm checking it out. Thanks!

So the rule would be something like:


Keep only 3 versions of LightingData.asset.

Keep only N versions of files above a given size, except if they are labelled.

Yes, that would work, though we would need a label preset option to avoid human error.

I'm actually very interested in this topic, because it makes a lot of sense for us to add extra features for teams using Unity.

I'll add more people to the conversation.

 

I'm glad this is welcome. I think you have some business to gain from Unity's users, since they screwed up badly with their own service (Collaborate). It was so neglected and full of bugs that it became unusable. I subscribed to your service in an emergency, and I believe you should offer a Unity preset plan, because that's what I was looking for.


  • 1 year later...

Hi.

Now that you've been bought by Unity, my feature suggestion might be less of a niche market.

There are already some user concerns about this, because Collaborate has a different policy for data storage. https://forum.unity.com/threads/announcement-plastic-scm-joins-unity-here-is-the-collaborate-to-plastic-scm-migration-guide.953252/#post-6222609

I hope that feature can be implemented. 😀

Thanks


  • 3 weeks later...
  • 3 months later...

I'm exactly the kind of user Marc S is mentioning: a Unity Collaborate user switching over to Plastic SCM.

I didn't know that history counted towards data storage, and I'm now worried that I won't be able to afford this service.

Are there any options to delete/limit history for files or file types? We're using Wwise for audio, and we need the full Wwise project to be inside the repo because multiple team members work on it. If we are making tweaks to audio files throughout the day, aren't the history versions of those files going to add up to a huge amount of storage?
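
(Back-of-the-envelope, with made-up numbers: a 40MB soundbank re-checked-in 4 times a day adds roughly 160MB of history per day, which is on the order of 3GB a month from that one file alone, minus whatever Plastic's compression recovers.)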

I know that there are options to "ignore" files/file types, but we need to maintain a single most-recent version of the audio files. I'm not really sure what to do.

 

Edit: I just wanted to add a bit more info for my particular situation, in case I am misunderstanding some of the basics here. We have a total of 4 members on our team, and currently our Unity project is about 3GB. I would have no complaints about paying for up to the 100GB data plan if that meant not having to worry about things. What I don't fully understand is what happens with all of the history versions of these binary files (audio assets, PDFs, etc.): are they kept forever? If so, I imagine the project would just continuously grow and easily surpass 100GB. Am I wrong in this understanding? Would I have any control over how long (how many revisions back) previous versions are kept?

Thanks a lot for any help, and sorry if I'm misunderstanding anything!

 

Edit 2: Bonus question. I stumbled upon the Plastic SCM plugin for Wwise. The only information I could find about it was in the release notes here: https://www.plasticscm.com/download/releasenotes/8.0.16.3937 Would it be recommended to use a separate repo for the Wwise project itself, and to use this plugin?


Hi,

Quote

Edit: I just wanted to add a bit more info for my particular situation, in case I am misunderstanding some of the basics here. We have a total of 4 members on our team, and currently our Unity project is about 3GB. I would have no complaints about paying for up to the 100GB data plan if that meant not having to worry about things. What I don't fully understand is what happens with all of the history versions of these binary files (audio assets, PDFs, etc.): are they kept forever? If so, I imagine the project would just continuously grow and easily surpass 100GB. Am I wrong in this understanding? Would I have any control over how long (how many revisions back) previous versions are kept?

In the Plastic Cloud plans, you pay for storage based on the database size. If you create multiple revisions of the same file, they will all be kept in the database.

Hopefully, we can soon provide a good solution to reduce/trim the size of the databases. It's on our roadmap. Earlier in this thread there are some good proposals/suggestions.

Quote

Edit 2: Bonus question. I stumbled upon the Plastic SCM plugin for Wwise. The only information I could find about it was in the release notes here: https://www.plasticscm.com/download/releasenotes/8.0.16.3937 Would it be recommended to use a separate repo for the Wwise project itself, and to use this plugin?

You can use this plugin with any repo. You don't need to create a separate repo to use the plugin. If you have any problems installing/using the Wwise plugin, please open a new thread and we will assist you with the process.

Sorry for the inconvenience,

Carlos.


On 12/28/2020 at 6:00 AM, calbzam said:

Hi,

In the Plastic Cloud plans, you pay for storage based on the database size. If you create multiple revisions of the same file, they will all be kept in the database.

Hopefully, we can soon provide a good solution to reduce/trim the size of the databases. It's on our roadmap. Earlier in this thread there are some good proposals/suggestions.

You can use this plugin with any repo. You don't need to create a separate repo to use the plugin. If you have any problems installing/using the Wwise plugin, please open a new thread and we will assist you with the process.

Sorry for the inconvenience,

Carlos.

Hey Carlos! Thanks for the reply. I didn't get an email notification (I'll check my account settings and see if I can enable notifications), but luckily I came back to the forum to ask another question, haha.

It would be great if there were some way to trim the database. Thank you guys for looking into that!

I'm trying to think of options for what I could do in the meantime. Looking at Ryxali's reply above: is deleting old/irrelevant changesets actually okay to do, or can it screw things up?

If deleting old/irrelevant changesets isn't an option, then is the only other way to reduce storage size to start a new repo from the current project and delete the old one?

Thanks for the help!


Glad to see this feature is starting to get some attention, and I hope we can get some news about it from the Plastic team.

One thing I'd like to address is that we need to keep the metadata. The main reason I don't delete my old repo and start fresh is that, while I'm unlikely to need the data after a year, I could still need the full history for legal reasons.

Keeping all the logs is like having a dashcam for work: it can save a lot of trouble if there is some kind of disagreement, and it should be kept for a few years.


Following multiple discussions about this feature and my experiences with Plastic and Git over the last few years, I've listed what I think we'd need, in order of importance.

The main feature would enable us to clean changesets that aren't merges or labelled (much like what the "only relevant" filter shows).

Important

  • Block that feature for non-manager users.
  • Keep all logs forever.
  • Be able to revert the process during the following month, so it's not risky to try it out.
  • Limit cleanup to changesets older than X days.
  • Add rules* so some files aren't affected and keep their history no matter what.

Less important

  • Show the cleaned changesets in the main interface at 50% opacity to avoid confusion.
  • An option to delete previously deleted big files from history (for example, if I replaced a big TGA texture with an identical but lighter PNG).
  • Automate the process so the cleanup is done after X days or X changesets.
  • Add rules* so some files never have a history, or only keep X versions.
  • Allow users to clean up a branch once they are done with it.

*Rules based on file name, extension, size, parent folder...

Pricing would be based on how much data is effectively available each month. So, for example, if the feature was used to clean up my repo today, I'd pay the usual price this month and the next, since the backup would still be accessible, and then see the effect on the price once that data is gone. I would then see my bills go up whenever I've worked on big files, but this would only be a temporary bump and not a permanent cost.

 

Here are some default settings I would recommend for Unity users (a concrete sketch of how they could compose follows the list). Feel free to add suggestions.

  • Always keep history: files that aren't in the /Assets folder.
  • Always keep history: .cs, .prefab, .txt, .meta, .json.
  • Always keep history: files < 50KB.
  • Never keep history: files named "LightingData.asset", or .psd files.
  • Delete from history after X days if deleted: files > 50MB, and .tga, .wav, LightingData.asset.
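
To make that concrete, here is a rough sketch in Python of how I imagine such rules composing: first match wins, and labelled (milestone) changesets are exempt. All the names here are invented; nothing like this exists in Plastic today.

  import fnmatch
  from dataclasses import dataclass
  from typing import Optional

  @dataclass
  class Rule:
      pattern: str                     # glob on the workspace-relative path
      keep_last: Optional[int]         # revisions to keep; None = keep all
      min_size: int = 0                # rule only applies to files at least this big...
      max_size: Optional[int] = None   # ...and at most this big
      min_age_days: int = 0            # only purge revisions older than this

  # Roughly the defaults suggested above (sizes in bytes), first match wins:
  RULES = [
      Rule("*.cs", None), Rule("*.prefab", None), Rule("*.txt", None),
      Rule("*.meta", None), Rule("*.json", None),
      Rule("*", None, max_size=50_000),                    # small files: keep all
      Rule("*/LightingData.asset", 1),                     # never keep history
      Rule("*.psd", 1),
      Rule("*", 3, min_size=50_000_000, min_age_days=30),  # big old assets: trim
  ]

  def keep_count(path: str, size: int, age_days: int,
                 labelled: bool) -> Optional[int]:
      """How many revisions of this file to keep (None = keep all)."""
      if labelled:
          return None  # milestone exception: labelled changesets keep everything
      if not path.startswith("Assets/"):
          return None  # only files under /Assets are ever trimmed
      for r in RULES:
          size_ok = size >= r.min_size and (r.max_size is None or size <= r.max_size)
          if fnmatch.fnmatch(path, r.pattern) and size_ok and age_days >= r.min_age_days:
              return r.keep_last
      return None  # no rule matched: keep full history

With that, keep_count("Assets/Art/hero.psd", 200_000_000, 90, labelled=False) returns 1, while the same file in a labelled changeset keeps everything.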

 

I hope we can see that happen soon. :)


Adding @mariechristineb to this thread to track and evaluate the feedback.

Quote

 

I'm trying to think of options for what I could do in the meantime. Looking at Ryxali's reply above: is deleting old/irrelevant changesets actually okay to do, or can it screw things up?

If deleting old/irrelevant changesets isn't an option, then is the only other way to reduce storage size to start a new repo from the current project and delete the old one?

 

In Plastic, you can only delete a changeset if it's not the parent of another changeset.

Regards,

Carlos.


Popping into this thread (after posting about a related one), because we're getting into that territory. We're baking lightmaps for one project, and they're in the gigabyte range. It currently takes a couple of hours to re-bake the lightmaps (we're trying to figure out how to speed that up, but we're already using Bakery). With that much time involved, it's desirable to have them checked into version control. But the size (and related fees) is concerning.


  • 8 months later...

Hi,

I just checked the status with the product team. Some major features are taking a bit longer than expected, but this topic is a high-priority item.

I will keep pushing for it to be scheduled as soon as possible, as I know it's a major request from our customers.

Sorry for the inconvenience,

Carlos.


1 hour ago, calbzam said:

Hi,

I just checked the status with the product team. Some major features are taking a bit longer than expected, but this topic is a high-priority item.

I will keep pushing for it to be scheduled as soon as possible, as I know it's a major request from our customers.

Sorry for the inconvenience,

Carlos.

Thanks Carlos!!


  • 6 months later...

Thanks @calbzam.

I saw it, but I've been too busy to risk trying it through the command prompt.

That said, I'd be happy to try it as soon as there is a UI. A simple "right-click on a branch -> Archive this branch" would be a very good start.

Do you guys have that somewhere in the Alpha UI?


1 hour ago, calbzam said:

Some GUI-based (or more user-friendly) options to reduce storage are on the roadmap and under design at the moment. But for the near future, I'm afraid the only option is using "cm archive".

Regards,

Carlos.

Awesome, glad to hear this! I'm also a bit worried that I'd screw something up using the command line. My ideal use case would be a GUI-based way to select a file in the Workspace Explorer, select Archive, and then select a date range (or maybe a changeset range). For me personally, I have two files for which I want to archive all versions older than, let's say, 3 months. So every couple of months I could just come in and archive all versions older than 3 months that I haven't already archived.

I was looking at the documentation in the link you posted, and I believe this is something that would be possible with the archive command? It seems that it is "per file" rather than "per branch"?


Hi,

Quote

I was looking at the documentation in the link you posted, and I believe this is something that would be possible with the archive command? It seems that it is "per file" rather than "per branch"?

The "cm archive" command extracts file revision data from the repository. The command requires one or more file revision specs:

cm archive --help
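
For example, something along these lines (please double-check the exact options with "cm archive --help" for your version before running it):

  cm archive LightingData.asset#br:/main --comment="trim history" --file=/backup/lightdata

Archived revisions can later be brought back with the same command using the "--restore" option.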

Regards,

Carlos.

