Google App Engine datastore tag cloud with Python
We have some unstructured textual data in our App Engine datastore. I wanted to create a 'one off' tag cloud of one property on a subset of the datastore objects. After looking around, I can't see a framework that would allow me to do this without writing it myself.
The way I had in mind was:
- write a map (as in map reduce) function to go over every object of a particular type in the datastore,
- split the text string into words
- for each word, increment a counter
- use the final counts to generate the tag cloud with some third-party software (offline - suggestions here welcome)
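The split-and-count steps above can be sketched in plain Python, independent of how the entities are fetched. This is only a minimal sketch: the function name `count_words`, the stop-word list, and the word pattern are assumptions for illustration, not part of any App Engine API.

```python
import re
from collections import Counter

# Hypothetical stop-word list; extend it to suit the data.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}

# Treat runs of lowercase letters, digits and apostrophes as words.
WORD_RE = re.compile(r"[a-z0-9']+")


def count_words(texts, stop_words=STOP_WORDS):
    """Return a Counter of word frequencies across an iterable of strings."""
    counts = Counter()
    for text in texts:
        if not text:
            continue  # skip entities whose property is empty or None
        words = WORD_RE.findall(text.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts


if __name__ == "__main__":
    sample = ["The quick brown fox", "A quick red fox jumps", None]
    print(count_words(sample).most_common(3))
```

The `texts` argument would be the values of the chosen property, whether they come from a datastore query or from an exported file.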
As I've never done this before, I was wondering firstly whether there is a framework around that does this for me (please), or if not, whether I am approaching this in the right way, i.e. please feel free to point out any gaping holes in my plan.
Feed TagCloud and PyTagCloud are two possibilities.
Feed TagCloud Generator is a gadget for Google App Engine that might fit your needs. Unfortunately, it's undocumented. Fortunately, it's rather simple, though I'm not sure how well-suited it is to your needs.
It operates on a feed and appears to be flexible, so if you have a feed of your site, it might not be much trouble to integrate, though the processing is done online.
PyTagCloud is worth a look. You'll be able to do the processing offline, and it generates rather handsome clouds.
All you'll have to do to get it working is export your datastore; the counts and splitting are done for you, as PyTagCloud can operate on text files. Following the instructions in the App Engine docs on uploading and downloading data will show you how to export your datastore to your local machine. You'll want to write an "exporter class", and then have PyTagCloud operate on the output.
If you decide to roll your own, you may want to skip online processing and use the offline method of uploading and downloading data described above, unless you want a dynamically-updated cloud. Iterating over the entire datastore and doing the counts online will be the annoying and expensive part of the task. It only makes sense if you want or need a dynamic tag cloud. As above, I'd recommend writing an "exporter class" and operating on the data locally.
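If you do roll your own on exported data, the local step might look roughly like the sketch below. It assumes the export is a CSV file with a header row; the column names `key` and `body`, the helper names, and the font-size range are all hypothetical, so adjust them to whatever your exporter actually writes.

```python
import csv
import io
import re
from collections import Counter


def counts_from_csv(csv_file, column):
    """Count words in one column of an exported CSV file object."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        text = (row.get(column) or "").lower()  # tolerate missing values
        counts.update(re.findall(r"[a-z0-9']+", text))
    return counts


def tag_sizes(counts, top_n=50, min_size=12, max_size=48):
    """Map the top_n words to font sizes, scaled linearly by count."""
    top = counts.most_common(top_n)
    if not top:
        return {}
    hi, lo = top[0][1], top[-1][1]
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {
        word: min_size + (count - lo) * (max_size - min_size) // span
        for word, count in top
    }


if __name__ == "__main__":
    # Stand-in for an exported file; the "body" column is an assumed name.
    exported = io.StringIO("key,body\n1,python python cloud\n2,python cloud tag\n")
    print(tag_sizes(counts_from_csv(exported, "body")))
```

The resulting word-to-size mapping can then be handed to whatever renders the cloud, for example as font sizes in a static HTML page.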