Jason,
Hmm, interesting... After using your "special tool", can you still open,
read and edit all of the "rich" (doc/rtf/ppt/xls/pdf) and html documents
with the original applications, i.e., winword.exe, excel.exe, powerpnt.exe
and Acrord32.exe?
If so, then you should be able to store these "rich" documents in a column
with an IMAGE datatype and then have the IMAGE column correctly FT indexed.
If not, then your best bet is to "strip" out the raw text from these
altered/rich documents and put the text in a column defined as either
varchar or TEXT (depending upon size), place the documents themselves in an
IMAGE column, and FT index the varchar or TEXT column.
Even if you don't use FTS, the above recommendations are good for all
readers of this fulltext newsgroup.
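
For readers who want to see the two-column layout spelled out, here is a minimal sketch. It is an illustration under stated assumptions, not a SQL Server script: it uses Python's built-in sqlite3 module as a stand-in, so BLOB plays the role of the IMAGE column and TEXT plays the role of the varchar/TEXT column. The table name "documents", the column names and the helper functions are all hypothetical.

```python
import sqlite3


def create_store(conn):
    """Create a table with one column for stripped text and one for the original file.

    Assumption: SQLite's TEXT/BLOB stand in for SQL Server's varchar-or-TEXT/IMAGE.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS documents (
            id        INTEGER PRIMARY KEY,
            file_name TEXT NOT NULL,   -- hypothetical: original file name
            raw_text  TEXT,            -- stripped/parsed text (the column you would FT index)
            content   BLOB NOT NULL    -- the "rich" file stored AS IS
        )
        """
    )


def add_document(conn, file_name, raw_text, content):
    """Insert one document and return its generated id."""
    cur = conn.execute(
        "INSERT INTO documents (file_name, raw_text, content) VALUES (?, ?, ?)",
        (file_name, raw_text, content),
    )
    return cur.lastrowid


def get_document(conn, doc_id):
    """Return (raw_text, content_bytes) for a document, or None if it does not exist."""
    row = conn.execute(
        "SELECT raw_text, content FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        return None
    return row[0], bytes(row[1])


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_store(conn)
    # Placeholder bytes standing in for a real .doc file read from disk.
    doc_id = add_document(conn, "report.doc", "stripped text here", b"\xd0\xcf\x11\xe0")
    print(get_document(conn, doc_id))
```

The point of the split is that the binary column is never interpreted by the database; the original bytes come back unchanged for further processing, while anything that needs searching goes against the text column.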
Regards,
John
Quote:
> Hello,
> I have hundreds of "rich" (doc/rtf/ppt/xls/pdf) and html documents.
> Using a special tool, the "rich" documents are marked by us, and the html
> documents are parsed and stripped (to textual content, without tags or code).
> After the tool has finished, we plan to insert each file content into the
> database.
> I'm wondering what the best database design for this is.
> Should I use two columns (text and image) to store the textual (parsed)
> content and the rich files separately? Or can I use a text field for the
> "rich" files as well?
> I have no plan to use ms-sql full text search, just store the "rich" files
> AS IS in the database for further processing.
> Thanks!