Package org.apache.tika.parser.txt
Class TXTParser
java.lang.Object
org.apache.tika.parser.AbstractEncodingDetectorParser
org.apache.tika.parser.txt.TXTParser
- All Implemented Interfaces:
Serializable, org.apache.tika.parser.Parser
public class TXTParser
extends org.apache.tika.parser.AbstractEncodingDetectorParser
Plain text parser. The text encoding of the document stream is automatically detected based on the byte patterns found at the beginning of the stream and on the given document metadata, most notably the charset parameter of a HttpHeaders.CONTENT_TYPE value.
This parser sets the following output metadata entries:
HttpHeaders.CONTENT_TYPE: text/plain; charset=...
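Since the Content-Type entry has the form text/plain; charset=..., a caller may want to read the charset parameter back out of that value. The following is a minimal sketch using only the JDK; the class ContentTypeCharset and the helper charsetOf are hypothetical names introduced for illustration and are not part of Tika.

```java
import java.nio.charset.Charset;
import java.util.Optional;

public class ContentTypeCharset {

    /**
     * Hypothetical helper (not a Tika API): returns the charset named by the
     * "charset" parameter of a Content-Type value, or empty if the parameter
     * is missing or names a charset this JVM does not support.
     */
    static Optional<Charset> charsetOf(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        // parts[0] is the media type itself; parameters follow after ';'
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            int eq = part.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = part.substring(0, eq).trim();
            if (!name.equalsIgnoreCase("charset")) {
                continue;
            }
            String value = part.substring(eq + 1).trim();
            // Parameter values may be quoted, e.g. charset="UTF-8"
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Optional.of(Charset.forName(value));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name
                return Optional.empty();
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/plain; charset=UTF-8"));
        System.out.println(charsetOf("text/plain"));
    }
}
```

The sketch treats a missing, malformed, or unsupported charset the same way (an empty result), which keeps the caller's handling simple; stricter code might want to tell these cases apart.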
Constructor Summary
Constructors
TXTParser()
TXTParser(org.apache.tika.detect.EncodingDetector encodingDetector)
Method Summary
Modifier and Type, Method
Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext context)
void parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context)
Methods inherited from class org.apache.tika.parser.AbstractEncodingDetectorParser
getEncodingDetector, getEncodingDetector, setEncodingDetector
Constructor Details
TXTParser
public TXTParser()
TXTParser
public TXTParser(org.apache.tika.detect.EncodingDetector encodingDetector)
Method Details
getSupportedTypes
public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext context)
parse
public void parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context) throws IOException, SAXException, org.apache.tika.exception.TikaException
- Throws:
IOException
SAXException
org.apache.tika.exception.TikaException
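A call to parse might look like the sketch below. It assumes tika-core and the Tika plain text parser module are on the classpath, and it uses org.apache.tika.sax.BodyContentHandler to collect the extracted text; the class TXTParserExample and the helper parsePlainText are hypothetical names introduced for illustration. The charset that ends up in the Content-Type entry depends on the encoding detectors available at run time, so the sketch makes no claim about its exact value.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.tika.metadata.HttpHeaders;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.txt.TXTParser;
import org.apache.tika.sax.BodyContentHandler;

public class TXTParserExample {

    /**
     * Hypothetical helper: parses the given bytes as plain text and returns
     * a two-element array of {extracted text, Content-Type metadata value}.
     */
    static String[] parsePlainText(byte[] bytes) throws Exception {
        TXTParser parser = new TXTParser();
        BodyContentHandler handler = new BodyContentHandler();
        Metadata metadata = new Metadata();
        try (InputStream stream = new ByteArrayInputStream(bytes)) {
            // May throw IOException, SAXException, or TikaException
            parser.parse(stream, handler, metadata, new ParseContext());
        }
        return new String[] {
            handler.toString(),
            metadata.get(HttpHeaders.CONTENT_TYPE)
        };
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = "Hello, plain text.\n".getBytes(StandardCharsets.UTF_8);
        String[] result = parsePlainText(bytes);
        System.out.println("Text: " + result[0]);
        System.out.println("Content-Type: " + result[1]);
    }
}
```

To supply a specific detector instead of the default, the TXTParser(EncodingDetector) constructor listed under Constructor Details can be used in place of the no-argument one.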