
Thread: Error "line too long" while using GpLoad

  1. #1
    Join Date
    Jan 2012
    Posts
    6

    Error "line too long" while using GpLoad

    Hi,

    I was trying to load a 6GB data file to my Greenplum database, when I encountered the following error:

    2012-04-30 00:34:04|ERROR|ERROR: gpfdist error - line too long in file (url.c:1192) (seg0 slice1 warrior:40000 pid=7466) (cdbdisp.c:1457)
    DETAIL: External table ext_gpload20120430_002808_27522, line N/A of gpfdist://abc:8000//home//inputData.dat: ""
    encountered while running INSERT INTO public.target_table ("col1","col2","col3","col4") SELECT "col1","col2","col3","col4" FROM ext_gpload20120430_002808_27522



    The gpload documentation lists the following option for gpfdist:
    -m max_length
    Sets the maximum allowed data row length in bytes. Default is 32768. Should be used when user data includes very wide rows (or when line too long error message occurs). Should not be used otherwise as it increases resource allocation. Valid range is 32K to 1MB.


    Can anyone shed some light on what may be causing this error? I've never come across it before, although I've been using gpload for quite some time now.
    Also, how can I set a gpfdist option when using gpload?
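Before raising the limit, it can help to check whether any row in the file really exceeds the 32768-byte default quoted above. The following is a minimal Python sketch, not part of gpload or gpfdist: the function name `longest_row` is hypothetical, and it measures physical lines only (split on newline bytes, line terminator excluded), which may differ from how gpfdist counts a row when quoting or escaping is involved.

```python
def longest_row(path, limit=32768):
    """Scan a data file and return (max_len, line_no, rows_over_limit).

    Lengths are in bytes and exclude the trailing CR/LF. The default limit
    mirrors the gpfdist default row length quoted in the post (assumption:
    the terminator is not counted toward that limit).
    """
    max_len = 0
    max_line_no = 0
    over_limit = 0
    # Binary mode: measure bytes, not decoded characters.
    with open(path, "rb") as f:
        for line_no, line in enumerate(f, start=1):
            n = len(line.rstrip(b"\r\n"))
            if n > max_len:
                max_len, max_line_no = n, line_no
            if n > limit:
                over_limit += 1
    return max_len, max_line_no, over_limit
```

If the longest physical line is well under the limit, the error is more likely caused by quoting, escaping, or file corruption than by genuinely wide rows.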

  2. #2
    Join Date
    Apr 2010
    Posts
    16

    Line too long error explanation

    We see this frequently when the data contains control characters that are not properly escaped. You should set the ESCAPE option to a character not present in the data.

    ESCAPE
    Specifies the single character that is used for C escape sequences (such as
    \n,\t,\100, and so on) and for escaping data characters that might otherwise be
    taken as row or column delimiters. Make sure to choose an escape character that is
    not used anywhere in your actual column data. The default escape character is a \
    (backslash) for text-formatted files and a " (double quote) for csv-formatted files,
    however it is possible to specify another character to represent an escape. It is also
    possible to disable escaping in text-formatted files by specifying the value 'OFF' as
    the escape value. This is very useful for data such as text-formatted web log data
    that has many embedded backslashes that are not intended to be escapes.
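To pick an escape character that is "not used anywhere in your actual column data", one option is to scan the file and see which candidates never occur. This is a rough Python sketch under stated assumptions: the function name `unused_bytes` and the default candidate list are hypothetical, and only single-byte characters are considered.

```python
def unused_bytes(path, candidates=b"\\|~^", chunk_size=1 << 20):
    """Return the candidate bytes (in the given order) that never occur in the file."""
    remaining = set(candidates)  # iterating bytes yields integer byte values
    with open(path, "rb") as f:
        while remaining:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Drop every candidate that shows up in this chunk.
            remaining -= set(chunk)
    return [bytes([b]) for b in candidates if b in remaining]
```

Any byte this returns is absent from the file and so is a possible value for the ESCAPE option. An empty result means every candidate appears somewhere in the data.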

  3. #3
    Join Date
    Mar 2011
    Posts
    9

    FYI, there's an undocumented option that can be put into the INPUT section of the gpload control file (which I discovered by reading gpload.py):

    - MAX_LINE_LENGTH: 32768

    But yes, it's often caused by an unescaped quote somewhere in the middle of the data.

    --Joe

  4. #4
    Join Date
    Apr 2010
    Posts
    11

    Also, I’ve seen this “line too long” error when loading compressed (gzip) files where the file turned out to be corrupted. Are you loading compressed files, such that some bad gzip files might be getting through to the load process? Correcting the corrupted file resolved the problem for us. (Running gunzip -t filename checks whether a gz file is valid.)
    Dave
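The same kind of integrity check can be scripted before each load. The sketch below uses Python's standard `gzip` module to decompress the whole file and discard the output, which is roughly what `gunzip -t` does. The function name `gzip_is_valid` is hypothetical, and note that a missing or unreadable file is also reported as invalid.

```python
import gzip
import zlib


def gzip_is_valid(path, chunk_size=1 << 20):
    """Return True if the file decompresses cleanly from start to end."""
    try:
        with gzip.open(path, "rb") as f:
            # Read and discard the data; errors surface during decompression.
            while f.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers and CRC mismatches (and I/O failures),
        # EOFError covers truncated files, zlib.error covers a damaged stream.
        return False
```

A load script could run this check over each input file and skip or flag the ones that fail, so a bad archive never reaches gpfdist.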

  5. #5
    Join Date
    Jun 2012
    Posts
    17

    I have also seen it when loading geospatial data. Polygon data can get huge.
