List All Assets

I was asked how I choose to name files. Here I collect data on asset files I have in many sites and try to make sense of my own conventions.

# Assets

Find all my sites based on owner.json files.

find .wiki -name owner.json \
  -exec grep -l "ward.cunningham" {} \; > xx

Convert to asset directory path.

perl -pe 's/status\/owner.json/assets/' xx > xy

Repeat find for each asset directory.

cat xy | while read -r d; do find "$d" -type f; done > xz

# Patterns

We can abstract these names by converting spans of letters and numbers to a, A, or 0. Then count uniques.

cat xz |\
  perl -pe 's/[a-z]+/a/g' |\
  perl -pe 's/[A-Z]+/A/g' |\
  perl -pe 's/[0-9]+/0/g' |\
  sort | uniq -c | sort -n

The winning pattern probably comes entirely from trail photos with names like "spring-2022/IMG_1234.jpeg".

1071 .a/a.a.a.a.a/a/a/a-0/A_0.a

We refine this to focus on the base file names and to recognize hex strings, detected as alternating runs of the letters a-f and digits, which we show as ##.

cat xz |\
  perl -pe 's/.*\///' |\
  perl -pe 's/\d*([a-f]+\d+){2,}[a-f]*/##/ig' |\
  perl -pe 's/[a-z]+/a/g' |\
  perl -pe 's/[A-Z]+/A/g' |\
  perl -pe 's/[0-9]+/0/g' |\
  sort | uniq -c | sort -n

Now all patterns occurring ten or more times.

11 0.0,-0.0.A
12 AaAaAa.a
13 0a.a
13 a-a-0.a
13 a.a.a.a
14 Aa_a_Aa.a
14 a_a.a
17 a-0.0.a
18 0-0-0.a
21 a-a-a-a.a
22 0-0-0-a.a
22 Aa_Aa.a
31 0-0.a
35 a_a_0.a
36 A.a
37 a0.a
45 Aa.a
53 Aa Aa 0-0-0 a 0.0.0 A.a
60 a.a.a
61 A_0.A
81 a-0.a
100 a-a-a.a
194 0-0-0-0.a
261 a-a.a
439 a.a
462 ##.a
1128 A_0.a
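One way to cut the tally down to the frequent patterns is to filter on the count column. This is a sketch under the assumption that the input is `uniq -c` style output with the count first; the function name `at_least` and the sample lines are made up for illustration.

```shell
# Keep only tally lines whose leading count is at least the given minimum.
# at_least is a hypothetical helper name.
at_least() {
  awk -v min="$1" '$1 >= min'
}

# Sample tally lines, for illustration only; patterns may contain spaces.
printf '%s\n' \
  '   3 a b.a' \
  '  10 a.a' \
  '  53 Aa Aa 0.a' |
  at_least 10
```

Appending `| at_least 10` to the pipeline above would apply the same filter to the real tally. awk prints each matching line unchanged, so patterns that contain spaces survive intact.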

# Curiosity

We'll trace back some of these patterns to the assets and maybe even the sites that host them. To this end we move from the command line to HTML scripts. github

http://ward.dojo.fed.wiki/assets/pages/list-all-assets/tally.html HEIGHT 600

pages/list-all-assets