2022-08-07 20:10:20.146 [main] INFO VMInfo - VMInfo# operatingSystem class => com.sun.management.internal.OperatingSystemImpl
2022-08-07 20:10:20.151 [main] INFO Engine - the machine info =>

	osInfo:  Ubuntu 11 11.0.16+8-post-Ubuntu-0ubuntu120.04
	jvmInfo: Linux amd64 5.13.0-52-generic
	cpu num: 8

	totalPhysicalMemory:            -0.00G
	freePhysicalMemory:             -0.00G
	maxFileDescriptorCount:         -1
	currentOpenFileDescriptorCount: -1

	GC Names [G1 Young Generation, G1 Old Generation]

	MEMORY_NAME                      | allocation_size | init_size
	CodeHeap 'profiled nmethods'     | 117.21MB        | 2.44MB
	G1 Old Gen                       | 1,024.00MB      | 970.00MB
	G1 Survivor Space                | -0.00MB         | 0.00MB
	CodeHeap 'non-profiled nmethods' | 117.22MB        | 2.44MB
	Compressed Class Space           | 1,024.00MB      | 0.00MB
	Metaspace                        | -0.00MB         | 0.00MB
	G1 Eden Space                    | -0.00MB         | 54.00MB
	CodeHeap 'non-nmethods'          | 5.57MB          | 2.44MB

2022-08-07 20:10:20.162 [main] INFO Engine -
{
	"content":[
		{
			"reader":{
				"name":"hdfsreader",
				"parameter":{
					"column":[
						"*"
					],
					"defaultFS":"hdfs://hadoop03:8020/",
					"encoding":"UTF-8",
					"fieldDelimiter":"\t",
					"fileType":"text",
					"path":"/user/hive/warehouse/user_info/user_info_data.txt"
				}
			},
			"writer":{
				"name":"hdfswriter",
				"parameter":{
					"column":[
						{
							"name":"user_id",
							"type":"string"
						},
						{
							"name":"area_id",
							"type":"string"
						},
						{
							"name":"age",
							"type":"int"
						},
						{
							"name":"occupation",
							"type":"string"
						}
					],
					"compress":"",
					"defaultFS":"hdfs://hadoop03:8020/",
					"fieldDelimiter":"\t",
					"fileName":"user_info_data_2.txt",
					"fileType":"text",
					"path":"/user/hive/warehouse/user_info/",
					"writeMode":"append"
				}
			}
		}
	],
	"setting":{
		"speed":{
			"channel":"2"
		}
	}
}

2022-08-07 20:10:20.172 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2022-08-07 20:10:20.173 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2022-08-07 20:10:20.173 [main] INFO JobContainer - DataX jobContainer starts job.
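The job description echoed by the Engine above is ordinary DataX job JSON. As a minimal sketch, the same configuration can be built and sanity-checked programmatically before submission; the values come straight from the log, while the consistency checks (matching NameNode and matching field delimiter on both sides) are my own assumptions, not checks DataX itself performs:

```python
import json

# Job config as echoed by the DataX Engine at startup (values from the log above).
job = {
    "content": [{
        "reader": {
            "name": "hdfsreader",
            "parameter": {
                "column": ["*"],
                "defaultFS": "hdfs://hadoop03:8020/",
                "encoding": "UTF-8",
                "fieldDelimiter": "\t",
                "fileType": "text",
                "path": "/user/hive/warehouse/user_info/user_info_data.txt",
            },
        },
        "writer": {
            "name": "hdfswriter",
            "parameter": {
                "column": [
                    {"name": "user_id", "type": "string"},
                    {"name": "area_id", "type": "string"},
                    {"name": "age", "type": "int"},
                    {"name": "occupation", "type": "string"},
                ],
                "compress": "",
                "defaultFS": "hdfs://hadoop03:8020/",
                "fieldDelimiter": "\t",
                "fileName": "user_info_data_2.txt",
                "fileType": "text",
                "path": "/user/hive/warehouse/user_info/",
                "writeMode": "append",
            },
        },
    }],
    "setting": {"speed": {"channel": "2"}},
}

# Hypothetical sanity checks before handing the file to DataX:
# reader and writer should point at the same NameNode and share a delimiter.
reader = job["content"][0]["reader"]["parameter"]
writer = job["content"][0]["writer"]["parameter"]
assert reader["defaultFS"] == writer["defaultFS"]
assert reader["fieldDelimiter"] == writer["fieldDelimiter"]

# Serialize to the job file that would be passed to DataX on the command line.
print(json.dumps(job, indent=2))
```

Writing the printed JSON to a file and passing it to the DataX launcher script reproduces a run like the one in this log.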
2022-08-07 20:10:20.174 [main] INFO JobContainer - Set jobId = 0
2022-08-07 20:10:20.185 [job-0] INFO HdfsReader$Job - init() begin...
2022-08-07 20:10:20.381 [job-0] INFO HdfsReader$Job - hadoopConfig details:{"finalParameters":[]}
2022-08-07 20:10:20.381 [job-0] INFO HdfsReader$Job - init() ok and end...
2022-08-07 20:10:20.969 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2022-08-07 20:10:20.969 [job-0] INFO JobContainer - DataX Reader.Job [hdfsreader] do prepare work .
2022-08-07 20:10:20.969 [job-0] INFO HdfsReader$Job - prepare(), start to getAllFiles...
2022-08-07 20:10:20.969 [job-0] INFO HdfsReader$Job - get HDFS all files in path = [/user/hive/warehouse/user_info/user_info_data.txt]
2022-08-07 20:10:21.395 [job-0] INFO HdfsReader$Job - [hdfs://hadoop03:8020/user/hive/warehouse/user_info/user_info_data.txt] is a [text]-type file; adding it to the source files list
2022-08-07 20:10:21.396 [job-0] INFO HdfsReader$Job - number of files about to be read: [1], list: [hdfs://hadoop03:8020/user/hive/warehouse/user_info/user_info_data.txt]
2022-08-07 20:10:21.396 [job-0] INFO JobContainer - DataX Writer.Job [hdfswriter] do prepare work .
2022-08-07 20:10:21.436 [job-0] INFO HdfsWriter$Job - because writeMode is set to append, no cleanup is done before writing; files with the name prefix [user_info_data_2.txt] will be written under the directory [/user/hive/warehouse/user_info/]
2022-08-07 20:10:21.437 [job-0] INFO JobContainer - jobContainer starts to do split ...
2022-08-07 20:10:21.437 [job-0] INFO JobContainer - Job set Channel-Number to 2 channels.
2022-08-07 20:10:21.437 [job-0] INFO HdfsReader$Job - split() begin...
2022-08-07 20:10:21.437 [job-0] INFO JobContainer - DataX Reader.Job [hdfsreader] splits to [1] tasks.
2022-08-07 20:10:21.438 [job-0] INFO HdfsWriter$Job - begin do split...
2022-08-07 20:10:21.441 [job-0] INFO HdfsWriter$Job - splited write file name:[hdfs://hadoop03:8020//user/hive/warehouse/user_info__a2c98626_89ef_412a_b96c_27a33bea62e5/user_info_data_2.txt__2a26328e_d57c_4c0e_8e2d_f14c3598829c]
2022-08-07 20:10:21.441 [job-0] INFO HdfsWriter$Job - end do split.
2022-08-07 20:10:21.441 [job-0] INFO JobContainer - DataX Writer.Job [hdfswriter] splits to [1] tasks.
2022-08-07 20:10:21.447 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2022-08-07 20:10:21.450 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2022-08-07 20:10:21.451 [job-0] INFO JobContainer - Running by standalone Mode.
2022-08-07 20:10:21.454 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2022-08-07 20:10:21.459 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2022-08-07 20:10:21.459 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2022-08-07 20:10:21.466 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2022-08-07 20:10:21.481 [0-0-0-reader] INFO HdfsReader$Job - hadoopConfig details:{"finalParameters":["mapreduce.job.end-notification.max.retry.interval","mapreduce.job.end-notification.max.attempts"]}
2022-08-07 20:10:21.482 [0-0-0-reader] INFO Reader$Task - read start
2022-08-07 20:10:21.482 [0-0-0-reader] INFO Reader$Task - reading file : [hdfs://hadoop03:8020/user/hive/warehouse/user_info/user_info_data.txt]
2022-08-07 20:10:21.484 [0-0-0-writer] INFO HdfsWriter$Task - begin do write...
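Judging from the paths in the split message above, HdfsWriter writes into a temporary location named after the target directory and file plus a double-underscore UUID suffix (dashes replaced by underscores), and only renames the file into place after the task succeeds. A hypothetical reconstruction of that naming scheme; `tmp_name` is my own helper, not DataX code:

```python
import re
import uuid

def tmp_name(base: str) -> str:
    """Append '__<uuid4 with dashes turned into underscores>' to a path
    segment, mirroring the temp dir/file names seen in the log."""
    return f"{base}__{str(uuid.uuid4()).replace('-', '_')}"

# Temp directory and temp file, as in the split log line above.
tmp_dir = tmp_name("/user/hive/warehouse/user_info")
tmp_file = tmp_name("user_info_data_2.txt")

# Both names end in the 8_4_4_4_12 hex pattern of a UUID with underscores,
# e.g. user_info__a2c98626_89ef_412a_b96c_27a33bea62e5
suffix = r".*__[0-9a-f]{8}(_[0-9a-f]{4}){3}_[0-9a-f]{12}$"
assert re.fullmatch(suffix, tmp_dir)
assert re.fullmatch(suffix, tmp_file)
```

Writing to a uniquely named temp path and renaming on success keeps a failed or concurrent run from corrupting files already in the target directory, which matches the rename and temp-dir cleanup steps later in this log.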
2022-08-07 20:10:21.485 [0-0-0-writer] INFO HdfsWriter$Task - write to file : [hdfs://hadoop03:8020//user/hive/warehouse/user_info__a2c98626_89ef_412a_b96c_27a33bea62e5/user_info_data_2.txt__2a26328e_d57c_4c0e_8e2d_f14c3598829c]
2022-08-07 20:10:21.498 [0-0-0-reader] INFO UnstructuredStorageReaderUtil - CsvReader uses default values [{"captureRawRecord":true,"columnCount":0,"comment":"#","currentRecord":-1,"delimiter":"\t","escapeMode":1,"headerCount":0,"rawRecord":"","recordDelimiter":"\u0000","safetySwitch":false,"skipEmptyRecords":true,"textQualifier":"\"","trimWhitespace":true,"useComments":false,"useTextQualifier":true,"values":[]}], csvReaderConfig is [null]
2022-08-07 20:10:21.507 [0-0-0-reader] INFO Reader$Task - end read source files...
2022-08-07 20:10:21.606 [0-0-0-writer] INFO HdfsWriter$Task - end do write
2022-08-07 20:10:21.667 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[204]ms
2022-08-07 20:10:21.667 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2022-08-07 20:10:31.463 [job-0] INFO StandAloneJobContainerCommunicator - Total 4 records, 79 bytes | Speed 7B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2022-08-07 20:10:31.464 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2022-08-07 20:10:31.465 [job-0] INFO JobContainer - DataX Writer.Job [hdfswriter] do post work.
2022-08-07 20:10:31.466 [job-0] INFO HdfsWriter$Job - start rename file [hdfs://hadoop03:8020//user/hive/warehouse/user_info__a2c98626_89ef_412a_b96c_27a33bea62e5/user_info_data_2.txt__2a26328e_d57c_4c0e_8e2d_f14c3598829c] to file [hdfs://hadoop03:8020//user/hive/warehouse/user_info/user_info_data_2.txt__2a26328e_d57c_4c0e_8e2d_f14c3598829c].
2022-08-07 20:10:31.478 [job-0] INFO HdfsWriter$Job - finish rename file [hdfs://hadoop03:8020//user/hive/warehouse/user_info__a2c98626_89ef_412a_b96c_27a33bea62e5/user_info_data_2.txt__2a26328e_d57c_4c0e_8e2d_f14c3598829c] to file [hdfs://hadoop03:8020//user/hive/warehouse/user_info/user_info_data_2.txt__2a26328e_d57c_4c0e_8e2d_f14c3598829c].
2022-08-07 20:10:31.478 [job-0] INFO HdfsWriter$Job - start delete tmp dir [hdfs://hadoop03:8020/user/hive/warehouse/user_info__a2c98626_89ef_412a_b96c_27a33bea62e5] .
2022-08-07 20:10:31.486 [job-0] INFO HdfsWriter$Job - finish delete tmp dir [hdfs://hadoop03:8020/user/hive/warehouse/user_info__a2c98626_89ef_412a_b96c_27a33bea62e5] .
2022-08-07 20:10:31.487 [job-0] INFO JobContainer - DataX Reader.Job [hdfsreader] do post work.
2022-08-07 20:10:31.487 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2022-08-07 20:10:31.488 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /home/zhouze/PycharmProjects/yili/yili-portal/app/datax/hook
2022-08-07 20:10:31.591 [job-0] INFO JobContainer -
	[total cpu info] =>
	averageCpu | maxDeltaCpu | minDeltaCpu
	-1.00%     | -1.00%      | -1.00%

	[total gc info] =>
	NAME                | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
	G1 Young Generation | 5            | 5               | 5               | 0.047s      | 0.047s         | 0.047s
	G1 Old Generation   | 0            | 0               | 0               | 0.000s      | 0.000s         | 0.000s

2022-08-07 20:10:31.591 [job-0] INFO JobContainer - PerfTrace not enable!
2022-08-07 20:10:31.592 [job-0] INFO StandAloneJobContainerCommunicator - Total 4 records, 79 bytes | Speed 7B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2022-08-07 20:10:31.593 [job-0] INFO JobContainer -
	Job start time            : 2022-08-07 20:10:20
	Job end time              : 2022-08-07 20:10:31
	Total elapsed time        : 11s
	Average throughput        : 7B/s
	Record write speed        : 0rec/s
	Total records read        : 4
	Total read/write failures : 0
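The throughput figures in the closing summary follow from the reported totals, assuming DataX truncates to whole units: 79 bytes over the 11 s wall clock gives 7 B/s, and 4 records over 11 s gives 0 rec/s. A quick check of that arithmetic:

```python
# Totals from "Total 4 records, 79 bytes" and the 20:10:20 -> 20:10:31 wall clock.
total_bytes = 79
total_records = 4
elapsed_s = 11

# Integer (floor) division, assuming DataX truncates the reported rates.
byte_speed = total_bytes // elapsed_s      # B/s
record_speed = total_records // elapsed_s  # rec/s

assert byte_speed == 7     # matches "Speed 7B/s" / "Average throughput: 7B/s"
assert record_speed == 0   # matches "0 records/s" / "0rec/s"
```

The 0 rec/s line is therefore an artifact of integer rounding on a tiny 4-record job, not a sign that the writer stalled.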