importing 21 gb text file

Post by Imran » Sat, 26 Apr 2003 17:36:12



I am trying to import a tab-delimited text file of about 21
GB with an estimated 63 million records.  I do not have a
multi-processor machine, only four or five 2.0 GHz Pentium
IV PCs that can be networked.  Two machines have about 30 GB
of disk to spare and the other two or three could probably
spare 10-12 GB.  They all have 256 MB of RAM.  I was hoping
to divide the database into tables of about 200,000 records
each so that I can load a whole table into RAM, although
I am not sure that makes any sense. As you can see, I am
not a professional.
  So my first question is: will I be able to run select
queries that search all the separate tables on these 4-5
PCs and give me combined results?
  My second question concerns a hash error that comes up
if I try to import records beyond record number 1245456.
What is a hash error and what can I do about it? Someone
mentioned correcting the hash character in the text file,
but I cannot open that file; it is too big to open in
Notepad or Word.

Imran J. Khan
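
For what it is worth, a file that size can be cut into 200,000-record pieces without ever loading it whole, by streaming it line by line. A minimal Python sketch along those lines; the file name data.txt, the chunks directory and the assumption of one record per line are all hypothetical:

import os

CHUNK_SIZE = 200000            # records per output file
SOURCE = "data.txt"            # hypothetical name of the 21 GB tab-delimited file
OUT_DIR = "chunks"             # hypothetical output directory

os.makedirs(OUT_DIR, exist_ok=True)

out = None
with open(SOURCE, "r", encoding="latin-1", errors="replace") as src:
    for i, line in enumerate(src):        # streams the file, never loads it whole
        if i % CHUNK_SIZE == 0:           # start a new chunk every 200,000 records
            if out is not None:
                out.close()
            name = os.path.join(OUT_DIR, "chunk_%04d.txt" % (i // CHUNK_SIZE))
            out = open(name, "w", encoding="latin-1")
        out.write(line)
if out is not None:
    out.close()

Each chunk file could then be imported as its own table, and the same loop is an easy way to count the records and check the 63 million estimate.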
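On the second question: what the hash error means depends on the DBMS, but if the cause really is a stray character near record 1245456, you do not need Notepad or Word to look at it. A small Python sketch (the file name and the exact record numbering are assumptions) that prints the records just before and after that point so the odd bytes become visible:

SOURCE = "data.txt"     # hypothetical name of the 21 GB file
TARGET = 1245456        # record number where the import reports the error
WINDOW = 3              # records to show on each side

with open(SOURCE, "rb") as src:                  # raw bytes, so nothing gets mangled
    for i, line in enumerate(src, start=1):      # assumes one record per line
        if TARGET - WINDOW <= i <= TARGET + WINDOW:
            print(i, repr(line))                 # repr() makes tabs and odd bytes visible
        elif i > TARGET + WINDOW:
            break

repr() shows tabs, stray '#' characters and other non-printable bytes explicitly; a similar loop that copies every line except the bad one (or writes a corrected version of it) would let you fix the file without opening it in an editor at all.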

 
 
 

importing 21 gb text file

Post by Vince » Sat, 03 May 2003 07:00:31


Hi,
Technically it seems possible, but it depends on your DBMS ...
In my opinion you should avoid doing this, because your data volume is
too large to be stored and used on such a workstation architecture.
Also, keep in mind that you will degrade your network performance.
I would say you will kill your network!

I need more information, if possible ...
What DBMS?
Where do your source data come from? Why such a huge flat file? Does
the flat file come from a single table in your legacy system, or does it
correspond to a huge denormalization? Does it store facts, dimensions or
both?
Is there any possibility of dividing your data in a FUNCTIONAL way,
not just from a technical point of view? That could be a first step toward a
better architecture design.

Vince


> I am trying to import a tab-delimited text file of about 21
> GB with an estimated 63 million records.  I do not have a
> multi-processor machine, only four or five 2.0 GHz Pentium
> IV PCs that can be networked.  Two machines have about 30 GB
> of disk to spare and the other two or three could probably
> spare 10-12 GB.  They all have 256 MB of RAM.  I was hoping
> to divide the database into tables of about 200,000 records
> each so that I can load a whole table into RAM, although
> I am not sure that makes any sense. As you can see, I am
> not a professional.
>   So my first question is: will I be able to run select
> queries that search all the separate tables on these 4-5
> PCs and give me combined results?
>   My second question concerns a hash error that comes up
> if I try to import records beyond record number 1245456.
> What is a hash error and what can I do about it? Someone
> mentioned correcting the hash character in the text file,
> but I cannot open that file; it is too big to open in
> Notepad or Word.

> Imran J. Khan
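
To illustrate what a functional split could look like: if the file has a column that naturally partitions the records (a year, a region code, a customer range; whether such a column exists is something only the original poster can say), the same streaming approach can route each record to a separate file per value instead of cutting at an arbitrary record count. A rough Python sketch, with the file name and the column position purely assumptions:

SOURCE = "data.txt"   # hypothetical name of the 21 GB tab-delimited file
KEY_COLUMN = 0        # assumed position of the partitioning column

outputs = {}          # one open output file per key value
with open(SOURCE, "r", encoding="latin-1", errors="replace") as src:
    for line in src:
        fields = line.rstrip("\n").split("\t")
        key = fields[KEY_COLUMN].strip() or "unknown"   # assumes values are safe as file names
        if key not in outputs:
            outputs[key] = open("part_%s.txt" % key, "w", encoding="latin-1")
        outputs[key].write(line)
for f in outputs.values():
    f.close()

This assumes the chosen column has a modest number of distinct values (otherwise you run out of file handles). Each resulting file would map to one table, and possibly to one of the PCs, which is a more meaningful division than an arbitrary 200,000-record cut.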


 
 
 

