[WWdevel] Re: CVS alternative

Thu Jan 13 23:55:33 EST 2005

On Jan 10, 2005, at 5:47 PM, John Jones wrote:

> Sam Hathaway wrote:
>
>> On Jan 10, 2005, at 4:13 PM, Michael Gage wrote:
>>
>>>>
>>>>  Problems start in the non-tagged side, basically however we find 
>>>> them.  Once this thing is initialized, I guess we can start filling 
>>>> that up with lots of pg files.  When it gets tagged, then it is 
>>>> moved to the tagged-side, which will be organized to mirror the 
>>>> hierarchical topic structure of the database.
>>>>
>>>>  We may not be able to "polish" every problem, but as that is done, 
>>>> it simply gives an updated version of the problem on the tagged 
>>>> side.  The setup as described above basically gives up on the 
>>>> notion of systematically polishing problems.  If we want to keep 
>>>> that alive, we should have 3 basic sub-divisions (raw, tagged, and 
>>>> tagged-and-polished).  Actually, this 3-part version might be a 
>>>> good way to go.
>>>>
>>> I like the 3 part version.  Possibly even a 4th part for problems 
>>> which can be used as models for future problems (exhibiting best 
>>> practices, etc. etc.)  This fourth part could be fairly small 
>>> however, and may not need to be a CVS.
>>
>>
>> Can anyone give me more details on how the repository of problem 
>> sources and the "database" will interact? Based on what little I 
>> know, it seems to me that the problem source should be part of the 
>> problem's database record.
>
> In a sense, things are reversed.  The problem files initially contain 
> all of the information; the extra information comes in the form of 
> special comments in those files.  We then have a script to set up a 
> mysql database, and to extract the information from the files and load 
> it into the database.

Would it be fair to say that the MySQL database does nothing more than 
act as an index on the metadata associated with each problem? Or am I 
missing something?
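
To make the "index" reading concrete, here is a minimal sketch of what I imagine the extraction script doing. It is written in Python with an in-memory SQLite database purely for illustration; the real script targets MySQL. The tag syntax (`## Subject('Calculus')`), the table layout, and the function names are all assumptions of mine, not the project's actual format.

```python
import re
import sqlite3

# Hypothetical tag syntax: a comment line such as  ## Subject('Calculus')
TAG_LINE = re.compile(r"^##\s*(\w+)\(\s*'([^']*)'\s*\)\s*$")


def extract_tags(source_text):
    """Return {tag_name: value} for every tag comment in a problem file."""
    tags = {}
    for line in source_text.splitlines():
        match = TAG_LINE.match(line.strip())
        if match:
            tags[match.group(1)] = match.group(2)
    return tags


def rebuild_index(problems):
    """Build a throwaway index from {path: source_text}.

    The problem files stay authoritative; the database can be dropped
    and rebuilt from them at any time.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE problem_tag (path TEXT, tag TEXT, value TEXT)")
    for path, text in problems.items():
        for tag, value in extract_tags(text).items():
            db.execute(
                "INSERT INTO problem_tag VALUES (?, ?, ?)", (path, tag, value)
            )
    db.commit()
    return db


def find_problems(db, tag, value):
    """Look up problem paths by a single tag value."""
    rows = db.execute(
        "SELECT path FROM problem_tag WHERE tag = ? AND value = ? ORDER BY path",
        (tag, value),
    )
    return [row[0] for row in rows]
```

If that is roughly right, the database holds nothing that cannot be regenerated from the files themselves.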

> I like this approach since it is easy to reload the database if 
> something goes wrong, and we are shipping mainly flat text files 
> (except for the images).

I like the simplicity of this, and in a distributed system like this, 
the more we can do with a version control system the better.

>> By the way, has anyone thought about how problems will be packaged? 
>> Many problems consist of more than one file and it might be worth 
>> laying out a packaging format, so that a problem and all of its 
>> auxiliary files and metadata can be distributed as a single file.
>
> I hadn't thought of the extra files.  Thus far, the problems were 
> basically not packaged in any special way.
>
> The distribution method I had in mind was that webwork would handle it 
> behind the scenes.  It would fetch files over http from perl (I think 
> the perl module is LWP, or something like that).  The entry point 
> would be an extra tab in the admin course (along with add course, ..., 
> and then Problem Database).  If you ask it to update your Problem 
> Library Database, then it fetches the current list of files/version 
> via http, checks it against your current list, and gets whatever is 
> new and reloads the database.
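
As I understand the update step described above, it boils down to comparing two file/version lists. A minimal sketch in Python, assuming each list is a simple mapping from path to version string (that layout and the function name are my own guesses, not an agreed format):

```python
def files_to_fetch(remote_manifest, local_manifest):
    """Return the sorted paths that are missing locally or whose
    recorded version differs from the server's.

    Both arguments are hypothetical {path: version_string} mappings.
    Versions are compared for inequality only, so CVS-style strings
    such as "1.9" and "1.10" never need to be ordered.
    """
    return sorted(
        path
        for path, version in remote_manifest.items()
        if local_manifest.get(path) != version
    )
```

Auxiliary files such as images fall out of the same comparison: a path that is absent locally has no recorded version, so it is always fetched.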

Shouldn't we leverage the version control system's checkout features to 
fetch and update problem libraries? It seems like a waste to keep the 
problems in CVS (or Subversion) and then ignore that system's versioning 
features, tracking versions separately and fetching via HTTP.

> My guess is that this is how the perl cpan module works, and it is how 
> the xemacs package system works.

By the way, CPAN modules are packaged in "distributions", tarballs 
which have a predictable naming scheme and layout and a standard way to 
build and install them.

> Since knowing which files need updating keys off of version numbers, 
> we may have to keep those as part of the files' metadata.

Would that still be a problem if you were to keep the local copy of the 
problem database as a checked-out CVS (or Subversion) working copy?

> If we use cvs for the files, we can just use the cvs revision number.  
> If we use subversion, then there is one number for all files, so every 
> change would make it look like all of the files need updating, so that 
> would be a case where a problem's version number would have to be 
> kept.

I don't really know, but I would expect Subversion to provide some way 
of identifying a version of a particular file. I know that it has the 
concept of a changeset, and that might be more like what you want 
anyway. Each changeset would encompass a small set of files, usually a 
single problem file but sometimes a problem and its auxiliary files.
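
To illustrate why one repository-wide number need not make every file look stale, here is a rough Python sketch. It assumes we can obtain the history as a list of (revision, paths) changesets; that input shape and the function names are hypothetical. Subversion itself reports something similar per file as "Last Changed Rev" in the output of `svn info`.

```python
def last_changed(changesets):
    """Map each path to the highest repository-wide revision that touched it.

    `changesets` is a hypothetical list of (revision_number, [paths]) pairs.
    """
    latest = {}
    for revision, paths in changesets:
        for path in paths:
            latest[path] = max(revision, latest.get(path, 0))
    return latest


def changed_since(changesets, local_revision):
    """Return the sorted paths touched after `local_revision`."""
    return sorted(
        path
        for path, revision in last_changed(changesets).items()
        if revision > local_revision
    )
```

Only the files named in later changesets are reported, even though the repository as a whole has moved to a higher revision number.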

> This approach should still be ok with extra associated files.  They 
> are listed in the manifest along with the problem files.  So, if you 
> don't have one at the time of an update, it will be fetched for you.

What is the manifest? I don't think you'd need any such thing if you 
were to use a version control system to track files.

Thanks for explaining this all to me. If you get sick of it, just let 
me know. I always have opinions about things that aren't really my 
business, but if you'd like to be left alone, say the word. :)
-sam