Supported File Types

This guide covers all file formats supported by Voxifi and best practices for each type.

Supported Formats

| Format | Extensions | Max Size | Best For |
|--------|------------|----------|----------|
| PDF | .pdf | 50 MB | Product guides, policies |
| Word | .docx, .doc | 50 MB | Formatted documents |
| Plain Text | .txt | 50 MB | FAQs, scripts, simple content |
| Markdown | .md | 50 MB | Structured documentation |
| CSV | .csv | 50 MB | Pricing, product data |
| TSV | .tsv | 50 MB | Tab-separated data |
| JSON | .json | 50 MB | Structured data |
| YAML | .yaml, .yml | 50 MB | Structured configuration data |
| XML | .xml | 50 MB | Structured data, feeds |
| Log | .log | 50 MB | Reference logs, records |
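
The extension and size limits above can be checked locally before uploading. The following is a minimal sketch, not part of Voxifi itself: `check_upload` is a hypothetical helper, and it assumes 1 MB means 1024 × 1024 bytes, which this guide does not specify.

```python
from pathlib import Path

# Extensions and size limit copied from the Supported Formats table.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".doc", ".txt", ".md", ".csv",
    ".tsv", ".json", ".yaml", ".yml", ".xml", ".log",
}
# Assumption: 1 MB = 1024 * 1024 bytes; the guide does not say which definition applies.
MAX_SIZE_BYTES = 50 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_SIZE_BYTES}")
    return problems
```

For a file on disk, the size can be read with `Path(filename).stat().st_size`.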

Format Details

PDF Files

Best for: Product catalogs, user manuals, official policies

Requirements:

  • Must be text-based (searchable), not scanned images
  • No password protection

Tips:

  • Use OCR on scanned documents before uploading
  • Remove unnecessary images to reduce file size
  • Ensure text is selectable (not just embedded images)

Good: Product manual with searchable text
Bad: Scanned paper document (image-only PDF)
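
Part of the requirements above can be screened with a rough byte-level check. This is only a heuristic sketch under stated assumptions: `quick_pdf_check` is a hypothetical helper, it treats any occurrence of `/Encrypt` as a sign of password protection, and it cannot tell whether a PDF is image-only. Detecting extractable text needs a real PDF library or a manual test (try selecting text in a viewer).

```python
def quick_pdf_check(data: bytes) -> list[str]:
    """Heuristic screen of raw PDF bytes; an empty list means nothing suspicious was found."""
    problems = []
    # PDF files start with a "%PDF-" version header.
    if not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- header")
    # Encrypted PDFs reference an /Encrypt dictionary. Assumption: the name
    # does not appear in the file for any other reason.
    if b"/Encrypt" in data:
        problems.append("file appears to be encrypted or password-protected")
    return problems
```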

Plain Text (.txt)

Best for: FAQs, call scripts, simple reference content

Requirements:

  • UTF-8 encoding

Tips:

  • Use clear headings with === or --- separators
  • Keep answers concise
  • One topic per section

Example structure:

Q: What is your return policy?
A: We accept returns within 30 days of purchase with original receipt.

Q: How long does shipping take?
A: Standard shipping takes 5-7 business days. Express is 2-3 days.

Markdown (.md)

Best for: Structured documentation, formatted guides

Tips:

  • Use headings to organize sections
  • Lists work well for multi-part answers
  • Tables are supported

Example:

```markdown
## Pricing Tiers

| Plan | Monthly | Annual |
|------|---------|--------|
| Basic | $29 | $290 |
| Pro | $79 | $790 |

## Features

### Basic Plan
- 100 calls/month
- Email support
- 1 AI agent
```

CSV Files

Best for: Product pricing, inventory data, structured information

Requirements:

  • Header row required
  • Standard comma-separated format

Tips:

  • Include descriptive column headers
  • Use consistent formatting

Example:

```csv
product_name,sku,price,stock_status
Widget Pro,WP-001,$49.99,In Stock
Widget Basic,WB-001,$29.99,In Stock
Widget Enterprise,WE-001,$99.99,Contact Sales
```
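
A CSV file can be checked against these requirements with Python's standard `csv` module. The sketch below uses a hypothetical helper, `check_csv`, and makes one assumption: it treats the first non-blank row as the header. It cannot tell whether that row really is a header, only that every column has a name and every data row has the same number of fields.

```python
import csv
import io


def check_csv(text: str) -> list[str]:
    """Return a list of problems found in CSV text; an empty list means none were found."""
    # csv.reader yields an empty list for blank lines, so filter those out.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    if any(not name.strip() for name in header):
        problems.append("header row has an empty column name")
    # Row numbers count non-blank rows, with the header as row 1.
    for row_number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_number} has {len(row)} fields, expected {len(header)}"
            )
    return problems
```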

Word Documents (.docx, .doc)

Best for: Existing business documents, formatted policies

Requirements:

  • No macros or forms

Tips:

  • Use built-in heading styles
  • Avoid complex tables or embedded objects
  • Convert to PDF if formatting is critical

JSON Files

Best for: Structured data, nested information

Requirements:

  • Valid JSON syntax
  • UTF-8 encoding

Example:

```json
{
  "products": [
    {
      "name": "Widget Pro",
      "price": 49.99,
      "features": ["Feature A", "Feature B"]
    }
  ],
  "policies": {
    "returns": "30 days with receipt",
    "warranty": "1 year limited"
  }
}
```
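
Both JSON requirements (valid syntax, UTF-8 encoding) can be verified with Python's standard library before uploading. This is a minimal sketch; `check_json` is a hypothetical helper name, and how Voxifi itself validates files is not documented here.

```python
import json
from typing import Optional


def check_json(data: bytes) -> Optional[str]:
    """Return an error message, or None if data is UTF-8 encoded, valid JSON."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as error:
        return f"not UTF-8: {error}"
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        return f"invalid JSON at line {error.lineno}, column {error.colno}: {error.msg}"
    return None
```

Read the file in binary mode (`open(path, "rb").read()`) so the encoding check runs on the raw bytes.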

YAML Files (.yaml, .yml)

Best for: Structured configuration data, product catalogs in YAML format

Requirements:

  • Valid YAML syntax
  • UTF-8 encoding

XML Files

Best for: Structured data exports, product feeds

Requirements:

  • Valid XML syntax
  • UTF-8 encoding
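
The XML requirements can be checked the same way with Python's standard `xml.etree.ElementTree` parser. In this sketch, `check_xml` is a hypothetical helper, and "valid XML syntax" is interpreted as well-formedness only; no schema or DTD validation is performed.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def check_xml(data: bytes) -> Optional[str]:
    """Return an error message, or None if data is UTF-8 encoded, well-formed XML."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as error:
        return f"not UTF-8: {error}"
    try:
        # Parse the raw bytes so the parser sees the file exactly as stored.
        ET.fromstring(data)
    except ET.ParseError as error:
        line, column = error.position
        return f"not well-formed at line {line}, column {column}"
    return None
```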

File Size Optimization

Reducing PDF Size

  1. Remove unnecessary pages
  2. Compress images
  3. Use PDF optimization tools
  4. Export without embedded fonts

Reducing Text File Size

  1. Remove redundant content
  2. Summarize lengthy sections
  3. Split into multiple focused files
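
Splitting a large text file (step 3) can be automated. The sketch below is one possible approach, not a Voxifi feature: the hypothetical `split_text` helper breaks text at blank lines and packs paragraphs into chunks up to a byte limit. It only groups by size, so splitting by topic still requires a manual pass.

```python
def split_text(text: str, max_bytes: int) -> list[str]:
    """Split text at blank lines into chunks of at most max_bytes (UTF-8 encoded).

    A single paragraph larger than max_bytes is kept whole, so that chunk
    will exceed the limit.
    """
    chunks = []
    current = ""
    for paragraph in text.split("\n\n"):
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if current and len(candidate.encode("utf-8")) > max_bytes:
            # Adding this paragraph would overflow the chunk: start a new one.
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be written to its own file.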

Content Best Practices

For AI Agents

Structure content so the AI can find and use it:

| Do | Don't |
|----|-------|
| Clear Q&A format | Long narrative paragraphs |
| Specific, factual information | Vague or ambiguous content |
| Current, accurate data | Outdated information |
| Consistent formatting | Mixed styles throughout |

Information Density

  • Include only what the AI needs to answer questions
  • Remove marketing fluff and filler content
  • Focus on facts: prices, policies, specifications

Update Frequency

Review and update files when:

  • Prices change
  • Products update
  • Policies change
  • Seasonal information changes

Troubleshooting

"Unsupported Format"

Convert your file to a supported format:

  • .xls/.xlsx → .csv
  • Scanned image → OCR to .pdf or .txt

"File Too Large"

Reduce file size:

  • Split into multiple smaller files
  • Compress images in PDFs
  • Remove unnecessary content

"Failed" Status

Common issues:

  • Password-protected PDF
  • Corrupted file
  • Non-UTF-8 encoding in text files
  • Image-only PDF (no extractable text)
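
For the encoding issue, a text file can be converted to UTF-8 before re-uploading. The sketch below assumes you know the file's original encoding; `to_utf8` is a hypothetical helper, and the `cp1252` default (common for files saved by older Windows applications) is only a guess that you should replace with the real encoding.

```python
def to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Return data re-encoded as UTF-8.

    Bytes that already decode as UTF-8 are returned unchanged. Otherwise the
    bytes are decoded with source_encoding, which raises UnicodeDecodeError
    if they are not valid in that encoding either.
    """
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        return data.decode(source_encoding).encode("utf-8")
```

Read and write the file in binary mode so Python does not apply its own default encoding.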
