Typically called "document analysis"
Heritage: hermeneutics, graphic design
Scope: One document type at a time
Reuse focus: identify "boilerplate" content and repeating structural elements
Heuristic rather than formal techniques
Descriptive "text encoding" to capture idiosyncratic aspects of instances
Typical textbook: Maler and El Andaloussi, Developing SGML DTDs: From Text to Model to Markup (1996)
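
The reuse heuristic above (spotting "boilerplate" across instances of one document type) can be illustrated with a minimal sketch. It assumes each instance is available as plain text and treats any line that recurs in a given share of instances as boilerplate. The function name find_boilerplate, the min_share parameter, and the sample invoices are hypothetical and not taken from any textbook or tool; real document analysis relies on human judgment well beyond this kind of line matching.

```python
from collections import Counter


def find_boilerplate(instances, min_share=1.0):
    """Return the lines that occur in at least min_share of the instances.

    instances: list of strings, one per document instance (hypothetical input).
    min_share: fraction of instances a line must appear in (1.0 = all of them).
    """
    counts = Counter()
    for text in instances:
        # Count each distinct non-empty line once per instance.
        counts.update({line.strip() for line in text.splitlines() if line.strip()})
    threshold = min_share * len(instances)
    return sorted(line for line, n in counts.items() if n >= threshold)


# Three made-up instances of a single document type (an invoice).
invoices = [
    "ACME Corp Invoice\nCustomer: Ada\nTotal: 10\nPayment due in 30 days",
    "ACME Corp Invoice\nCustomer: Bob\nTotal: 25\nPayment due in 30 days",
    "ACME Corp Invoice\nCustomer: Cy\nTotal: 7\nPayment due in 30 days",
]

print(find_boilerplate(invoices))
# → ['ACME Corp Invoice', 'Payment due in 30 days']
```

Lines that vary between instances (customer, total) are left over as candidates for the variable content components, while the recurring lines are candidates for reusable boilerplate.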