Hadoop Distributed File System (HDFS): A reliable, high-bandwidth, low-cost data storage cluster that facilitates the management of related files across machines.
Hadoop MapReduce: A high-performance parallel/distributed data-processing implementation of the MapReduce algorithm.
Hadoop YARN: A framework for job scheduling and cluster resource management.
Hadoop Common: The common utilities that support the other Hadoop modules.
Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]

    <path>                    start checking from this path (defaults to the root directory / if omitted)
    -move                     move corrupted files to /lost+found
    -delete                   delete corrupted files
    -files                    print out files being checked
    -openforwrite             print out files opened for write
    -includeSnapshots         include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it
    -list-corruptfileblocks   print out list of missing blocks and files they belong to
    -blocks                   print out block report
    -locations                print out locations for every block, i.e. which nodes hold it
    -racks                    print out network topology for data-node locations
    -storagepolicies          print out storage policy summary for the blocks
    -blockId                  print out which file this blockId belongs to, locations (nodes, racks) of this block, and other diagnostic information (under-replicated, corrupted or not, etc.)
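
The nesting of flags in the usage line can be easier to follow as a small helper that assembles the command. The Python sketch below is illustrative only: the function names `build_fsck_command` and `run_fsck` and the path `/user/data` are hypothetical, and the validation follows the usage grammar as printed, which may be stricter than what the `hdfs` tool itself enforces. Only flags listed in the usage text are used.

```python
import shutil
import subprocess


def build_fsck_command(path="/", files=False, blocks=False,
                       locations=False, racks=False, list_corrupt=False):
    """Build an argument list for `hdfs fsck` (hypothetical helper).

    Mirrors the printed usage:
      [-list-corruptfileblocks | [...] [-files [-blocks [-locations | -racks]]]]
    """
    cmd = ["hdfs", "fsck", path]

    if list_corrupt:
        # In the usage line this flag is an alternative to the reporting flags.
        if files or blocks or locations or racks:
            raise ValueError("-list-corruptfileblocks cannot be combined "
                             "with -files/-blocks/-locations/-racks")
        cmd.append("-list-corruptfileblocks")
        return cmd

    # Assumption: enforce the nesting exactly as the usage line prints it.
    if blocks and not files:
        raise ValueError("-blocks is nested under -files")
    if (locations or racks) and not blocks:
        raise ValueError("-locations and -racks are nested under -blocks")
    if locations and racks:
        raise ValueError("usage lists -locations and -racks as alternatives")

    if files:
        cmd.append("-files")
    if blocks:
        cmd.append("-blocks")
    if locations:
        cmd.append("-locations")
    if racks:
        cmd.append("-racks")
    return cmd


def run_fsck(cmd):
    """Run the command if `hdfs` is on PATH; return its stdout, else None."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # /user/data is a placeholder path, not taken from the original notes.
    command = build_fsck_command("/user/data", files=True, blocks=True,
                                 locations=True)
    print(" ".join(command))
    report = run_fsck(command)
    if report is not None:
        print(report)
```

Building the argument list separately from running it keeps the flag logic checkable on a machine without a Hadoop installation; `run_fsck` simply returns `None` when the `hdfs` binary is not found.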