A noble approach to develop dynamically scalable namenode in hadoop distributed file system using secondary storage

Tumpa Rani Shaha; Md Nasim Akhtar; Fatema Tuj Johora; Md Zakir Hossain; Mostafijur Rahman; R. Badlishah Ahmad

doi:10.11591/ijeecs.v13.i2.pp729-736

Tumpa Rani Shaha^*, Md Nasim Akhtar, Fatema Tuj Johora, Md Zakir Hossain, Mostafijur Rahman, R. Badlishah Ahmad

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Hadoop is widely used for scalable data storage. It provides a distributed file system that stores data on the compute nodes, using a master/slave architecture that consists of a NameNode and many DataNodes. The DataNodes hold application data, while the metadata for that data resides in the main memory of the NameNode. The existing cached approach fragments the metadata according to last access time and moves the least frequently used data to secondary memory. If requested data is not found in main memory, the secondary data is loaded back into RAM, and once it is reloaded the NameNode's main-memory limitation arises again. This research aims to reduce the namespace problem of main memory and to make the system dynamically scalable. A new Metadata Fragmentation Algorithm is proposed that separates the NameNode's metadata list dynamically. The NameNode creates a Secondary Memory File according to a threshold value and allocates secondary memory locations as required. Under the proposed algorithm, at most three quarters of main memory is in use at secondary file caching time, and the remaining free space helps the Dynamically Scalable NameNode approach operate faster. Compared with the existing fragmentation algorithm, the proposed algorithm increases space utilization to 17% and time utilization to 0.0005%.
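The threshold-based spilling described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Metadata Fragmentation Algorithm, whose details the abstract does not give. The class name `NameNodeSketch`, the JSON fragment files, the "spill the coldest half" rule, and the choice to serve spilled entries straight from disk without reloading them are all assumptions made for illustration. The 0.75 default threshold is one reading of the abstract's main-memory bound.

```python
import json
import tempfile
from collections import OrderedDict
from pathlib import Path


class NameNodeSketch:
    """Toy model (hypothetical): metadata lives in memory, cold entries spill to disk."""

    def __init__(self, capacity, spill_dir, threshold=0.75):
        self.limit = int(capacity * threshold)  # max entries kept in memory
        self.memory = OrderedDict()             # path -> metadata, coldest first
        self.index = {}                         # path -> fragment file holding it
        self.spill_dir = Path(spill_dir)
        self.fragments = 0

    def put(self, path, meta):
        self.index.pop(path, None)  # a fresh write supersedes any spilled copy
        self.memory[path] = meta
        self.memory.move_to_end(path)
        if len(self.memory) > self.limit:
            self._spill()

    def _spill(self):
        # Assumed policy: move the coldest half of memory into one new fragment file.
        count = max(1, len(self.memory) // 2)
        cold = dict(self.memory.popitem(last=False) for _ in range(count))
        fragment = self.spill_dir / f"fragment_{self.fragments}.json"
        fragment.write_text(json.dumps(cold))
        self.fragments += 1
        for path in cold:
            self.index[path] = fragment

    def get(self, path):
        if path in self.memory:
            self.memory.move_to_end(path)  # refresh last access
            return self.memory[path]
        fragment = self.index.get(path)
        if fragment is None:
            return None
        # Assumed behaviour: answer from the secondary file, leaving memory untouched.
        return json.loads(fragment.read_text())[path]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        node = NameNodeSketch(capacity=8, spill_dir=tmp)
        for i in range(10):
            node.put(f"/f{i}", {"blocks": [i]})
        print(len(node.memory), node.fragments, node.get("/f0"))
```

Because a lookup that misses in memory reads only the one fragment file and does not promote the entry, the in-memory table never grows past the threshold on reads, which is the property the abstract contrasts with the cached approach.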

Original language	English
Pages (from-to)	729-736
Number of pages	8
Journal	Indonesian Journal of Electrical Engineering and Computer Science
Volume	13
Issue number	2
DOIs	https://doi.org/10.11591/ijeecs.v13.i2.pp729-736
Publication status	Published - Feb 1 2019
Externally published	Yes

Keywords

Datanode
Hadoop
Metadata
Namenode
Secondary storage

ASJC Scopus subject areas

Signal Processing
Information Systems
Hardware and Architecture
Computer Networks and Communications
Control and Optimization
Electrical and Electronic Engineering

Cite this

@article{1ebfebcff27f4f58ad2c163fc70fb65a,

title = "A noble approach to develop dynamically scalable namenode in hadoop distributed file system using secondary storage",

abstract = "For scalable data storage, Hadoop is widely used nowadays. It provides a distributed file system that stores data on the compute nodes. Basically, it represents a master/slave architecture that consists of a NameNode and copious Data Nodes. Data Nodes contain application data and metadata of application data resides in the Main Memory of NameNode. In cached approach, they fragment the metadata depending on the last access time and move the least frequently used data to secondary memory. If the requested data is not found in main memory then the secondary data will be loaded again on the RAM. So when the secondary data reloads to the primary memory then the NameNode main memory limitation arises again. The focus of this research is to reduce the namespace problem of main memory and to make the system dynamically scalable. A new Metadata Fragmentation Algorithm is proposed that separates the metadata list of NameNode dynamically. The NameNode creates Secondary Memory File in perspective of the threshold value and allocates secondary memory location based on the requirement. According to the proposed algorithm the maximum third, out of fourth of main memory is used at the secondary file caching time. The free space aids in faster operation by Dynamically Scalable NameNode approach. This proposed algorithm shows that the space utilization is increased to 17% and time utilization is increased to 0.0005% with the comparison of the existing fragmentation algorithm.",

keywords = "Datanode, Hadoop, Metadata, Namenode, Secondary storage",

author = "Shaha, {Tumpa Rani} and Akhtar, {Md Nasim} and Johora, {Fatema Tuj} and Hossain, {Md Zakir} and Rahman, Mostafijur and Ahmad, {R. Badlishah}",

year = "2019",

month = feb,

day = "1",

doi = "10.11591/ijeecs.v13.i2.pp729-736",

language = "English",

volume = "13",

pages = "729--736",

journal = "Indonesian Journal of Electrical Engineering and Computer Science",

issn = "2502-4752",

publisher = "Institute of Advanced Engineering and Science (IAES)",

number = "2",

}

TY - JOUR

T1 - A noble approach to develop dynamically scalable namenode in hadoop distributed file system using secondary storage

AU - Shaha, Tumpa Rani

AU - Akhtar, Md Nasim

AU - Johora, Fatema Tuj

AU - Hossain, Md Zakir

AU - Rahman, Mostafijur

AU - Ahmad, R. Badlishah

PY - 2019/2/1

Y1 - 2019/2/1

N2 - For scalable data storage, Hadoop is widely used nowadays. It provides a distributed file system that stores data on the compute nodes. Basically, it represents a master/slave architecture that consists of a NameNode and copious Data Nodes. Data Nodes contain application data and metadata of application data resides in the Main Memory of NameNode. In cached approach, they fragment the metadata depending on the last access time and move the least frequently used data to secondary memory. If the requested data is not found in main memory then the secondary data will be loaded again on the RAM. So when the secondary data reloads to the primary memory then the NameNode main memory limitation arises again. The focus of this research is to reduce the namespace problem of main memory and to make the system dynamically scalable. A new Metadata Fragmentation Algorithm is proposed that separates the metadata list of NameNode dynamically. The NameNode creates Secondary Memory File in perspective of the threshold value and allocates secondary memory location based on the requirement. According to the proposed algorithm the maximum third, out of fourth of main memory is used at the secondary file caching time. The free space aids in faster operation by Dynamically Scalable NameNode approach. This proposed algorithm shows that the space utilization is increased to 17% and time utilization is increased to 0.0005% with the comparison of the existing fragmentation algorithm.

AB - For scalable data storage, Hadoop is widely used nowadays. It provides a distributed file system that stores data on the compute nodes. Basically, it represents a master/slave architecture that consists of a NameNode and copious Data Nodes. Data Nodes contain application data and metadata of application data resides in the Main Memory of NameNode. In cached approach, they fragment the metadata depending on the last access time and move the least frequently used data to secondary memory. If the requested data is not found in main memory then the secondary data will be loaded again on the RAM. So when the secondary data reloads to the primary memory then the NameNode main memory limitation arises again. The focus of this research is to reduce the namespace problem of main memory and to make the system dynamically scalable. A new Metadata Fragmentation Algorithm is proposed that separates the metadata list of NameNode dynamically. The NameNode creates Secondary Memory File in perspective of the threshold value and allocates secondary memory location based on the requirement. According to the proposed algorithm the maximum third, out of fourth of main memory is used at the secondary file caching time. The free space aids in faster operation by Dynamically Scalable NameNode approach. This proposed algorithm shows that the space utilization is increased to 17% and time utilization is increased to 0.0005% with the comparison of the existing fragmentation algorithm.

KW - Datanode

KW - Hadoop

KW - Metadata

KW - Namenode

KW - Secondary storage

UR - http://www.scopus.com/inward/record.url?scp=85060852974&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85060852974&partnerID=8YFLogxK

U2 - 10.11591/ijeecs.v13.i2.pp729-736

DO - 10.11591/ijeecs.v13.i2.pp729-736

M3 - Article

AN - SCOPUS:85060852974

SN - 2502-4752

VL - 13

SP - 729

EP - 736

JO - Indonesian Journal of Electrical Engineering and Computer Science

JF - Indonesian Journal of Electrical Engineering and Computer Science

IS - 2

ER -
