
Re: CPAN is getting too big



On Wed, 5 Jan 2000, Kurt D. Starsinic wrote:

> > Before re-inventing wheels, what do you think?
> 
>     rm authors/id/G*AR/perl5.00[45]*.patch.gz

On Wed, 5 Jan 2000, Chris Nandor wrote:

> Ignoring the copyright restrictions for a moment: one thing you might
> do is ignore anything that follows the binary distribution format (it
> will normally contain "-bin-" in the name), or ones that end in ".bin"
> or ".sit" or ".hqx".  I have some significantly large files in my
> directory that follow that, and they are binary distributions of
> things.  Filtering out those in my directory alone might save about
> 4-5MB.  :-)


Or in other words, we could publish a list at the top level that would
allow people to omit certain files when creating a CD of CPAN.  Kind
of like the -I include-file option to tar.

Exclude patches and binary distributions, and maybe don't distribute every
version of every module ever created.  Sounds like a job for perl :)

#!/usr/local/bin/perl -w

use File::Find;

open RFL, '>recommended_file_list.txt' or
  die "Couldn't open recommended_file_list.txt for writing: $!\n";

select RFL;
$\ = "\n";

find(\&CPAN_grepper, '.');  ##Run from the top level of CPAN somewhere

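sub CPAN_grepper {
  return unless -f;  ##List plain files only, not directories
  ##Kurt's shell glob, rewritten as a regex
  return if $File::Find::name =~
    m#authors/id/G[^/]*AR/perl5\.00[45][^/]*\.patch\.gz$#;
  ##Other exclusion conditions go here
  print $File::Find::name;  ##Print full path and name to RFL (the selected handle)
}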

Then that file can also be used to generate a size for "distributed CPAN"
to see if the exercise was worth it.
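
A minimal sketch of that size check, assuming the list holds one path per
line, relative to the directory it was generated in. The name total_size
is made up for illustration.

```perl
#!/usr/local/bin/perl -w
use strict;

# Sum the sizes of the files named in a list, one path per line.
# Returns (total bytes, files found, entries that are not plain files).
sub total_size {
  my ($list_file) = @_;
  open my $fh, '<', $list_file
    or die "Couldn't open $list_file for reading: $!\n";
  my ($bytes, $found, $missing) = (0, 0, 0);
  while (my $path = <$fh>) {
    chomp $path;
    next unless length $path;      # skip blank lines
    if (-f $path) {
      my $size = -s $path;         # undef for an empty file
      $bytes += defined $size ? $size : 0;
      $found++;
    } else {
      $missing++;
    }
  }
  close $fh;
  return ($bytes, $found, $missing);
}

# Only runs when a list file is given on the command line.
if (@ARGV) {
  my ($bytes, $found, $missing) = total_size($ARGV[0]);
  printf "%d files, %.1f MB (%d entries missing)\n",
    $found, $bytes / (1024 * 1024), $missing;
}
```

Run it from the top level of CPAN with recommended_file_list.txt as the
argument, then compare the figure against the size of the full archive.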

Of course, from where I am, it's much easier to process find-ls.gz.
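
A rough sketch of that approach, not a tested tool. It assumes find-ls.gz
holds ordinary `find -ls` output (eleven whitespace-separated fields, with
the size in the seventh and the name last) and that a gzip binary is on the
PATH. The helper names parse_ls_line and tally are made up for
illustration; the patterns are the ones suggested earlier in this thread.

```perl
#!/usr/local/bin/perl -w
use strict;

# Patterns to leave out: the patch files from Kurt's message and the
# binary-distribution names from Chris's message.
my @EXCLUDE = (
  qr#authors/id/G[^/]*AR/perl5\.00[45][^/]*\.patch\.gz$#,
  qr#-bin-[^/]*$#,
  qr#\.(?:bin|sit|hqx)$#,
);

# Split one line of `find -ls` output into (permissions, size, name).
# ASSUMPTION about the layout:
#   inode blocks perms links owner group size month day time name
sub parse_ls_line {
  my ($line) = @_;
  chomp $line;
  my @f = split ' ', $line, 11;
  return unless @f == 11 && $f[6] =~ /^\d+$/;
  return ($f[2], $f[6], $f[10]);
}

# Return (bytes kept, bytes excluded) for the regular files in a listing.
sub tally {
  my ($fh) = @_;
  my ($kept, $excluded) = (0, 0);
  while (my $line = <$fh>) {
    my ($perms, $size, $name) = parse_ls_line($line) or next;
    next unless $perms =~ /^-/;    # regular files only
    if (grep { $name =~ $_ } @EXCLUDE) { $excluded += $size }
    else                               { $kept     += $size }
  }
  return ($kept, $excluded);
}

# Only runs when the path to find-ls.gz is given on the command line.
if (@ARGV) {
  open my $fh, '-|', 'gzip', '-dc', $ARGV[0]
    or die "Couldn't run gzip on $ARGV[0]: $!\n";
  my ($kept, $excluded) = tally($fh);
  close $fh;
  printf "kept %.1f MB, excluded %.1f MB\n",
    $kept / (1024 * 1024), $excluded / (1024 * 1024);
}
```

Working from the listing means the sizes come for free, without touching
the files themselves.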

Colin


References to:
"Kurt D. Starsinic" <kstar@chapin.edu>
