exploring scripting to help develop conceptual models from a textual corpus

May 18, 2018 - Blogging

as i start to get the hang of this blogging thing, and of wordpress, my next aim is to begin cleaning up some scripts i have been working on: text processing scripts that extract vocab from pdfs and organise it into repeating patterns.

 

i'm interested in taking a collection of pdfs from a writer (using foucault at this stage) or a domain (considering using books on liberalism and republicanism) to identify vocabulary, phraseology and patterns of expression, then to use this to develop conceptual models of their subject domains.
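a minimal sketch of what the vocabulary and repeated-phrase step could look like, assuming the text has already been pulled out of the pdfs into a plain string (the extraction itself is not shown). the names `tokenize`, `vocabulary` and `repeated_phrases` are hypothetical and purely for illustration, they are not the actual scripts:

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def vocabulary(text):
    """Count how often each word occurs."""
    return Counter(tokenize(text))


def repeated_phrases(text, n=2, min_count=2):
    """Return (phrase, count) pairs for n-word phrases seen at least min_count times.

    Note: this naive version ignores punctuation, so phrases can
    span sentence boundaries.
    """
    tokens = tokenize(text)
    # Slide a window of n tokens across the text.
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in grams)
    return [(p, c) for p, c in counts.most_common() if c >= min_count]


if __name__ == "__main__":
    sample = "power produces knowledge. knowledge produces power. power produces subjects."
    print(vocabulary(sample).most_common(3))
    print(repeated_phrases(sample, n=2))
```

counting raw n-grams is only a starting point. filtering out stop words or grouping inflected forms would probably be needed before the recurring phrases say much about a writer's concepts.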

The aim is to distill the logic of a subject domain and to present this as a conceptual model.

I have had some success doing this informally, just through listening and writing, but i feel that a more rigorous approach might lead on to a method and might help improve the quality of my modelling generally.

I have found over the years that data models, for all their rigour, don’t quite hit the mark in getting people to understand the dynamics of a domain, and that process models are equally ineffective. the aim is brevity, saliency... i will wax eloquent on the desiderata for this domain in future posts, but for now, just noting that the quest to automate some of these steps and to formalise the pattern matching will at the least help me to understand what makes a good model...

here endeth the rant.