
Hive Interview Questions



Cracking an interview can feel like a daunting task, since the outcome shapes your future in a company. With the help of our Hive interview questions, however, you can prepare for interviews from the comfort of home. Our experts and professionals know what interviewers typically ask, and they have prepared these questions and answers accordingly, making it easier for you to answer convincingly and crack the interview.

About Hive

Hive is a data warehouse software solution built on top of Hadoop that offers data querying and analysis. It provides a SQL-like interface to analyze data stored in various databases and file systems that integrate with Apache Hadoop, and it facilitates reading, writing, and analyzing huge datasets in distributed storage using SQL-like syntax. It essentially serves as an ETL and analytics layer for the Hadoop ecosystem. If you know Hive and the Hadoop stack, you can apply for jobs in this field at many software companies.

Best Hive Interview Questions of 2024

1. What are the different types of tables available in Hive?

There are two types: managed tables and external tables. For a managed table, both the data and the schema are under Hive's control; for an external table, only the schema is under Hive's control, and dropping the table leaves the underlying data in place.
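The difference can be sketched in HiveQL as follows (the table names and the location path here are invented for illustration):

```sql
-- Managed table: Hive owns the data; DROP TABLE removes both metadata and files.
CREATE TABLE managed_sales (id INT, amount DOUBLE);

-- External table: Hive only tracks the schema; DROP TABLE leaves the files
-- at the given (hypothetical) location untouched.
CREATE EXTERNAL TABLE external_sales (id INT, amount DOUBLE)
LOCATION '/data/sales';
```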

2. Is Hive suitable to be used for OLTP systems? Why?

No. Hive does not provide insert and update at the row level by default, so it is not suitable for OLTP systems.

3. Can a table be renamed in Hive?

Yes, using:

ALTER TABLE table_name RENAME TO new_name;

4. Can we change the data type of a column in a hive table?

Yes, using the CHANGE COLUMN clause (REPLACE COLUMNS can also be used to redefine the entire column list):

ALTER TABLE table_name CHANGE COLUMN col_name col_name new_type;

5. What is a metastore in Hive?

It is a relational database that stores the metadata of Hive tables, partitions, Hive databases, etc.

6. What is the need for custom Serde?

Depending on the nature of the data the user has, the built-in SerDes may not match the data's format. In that case, users need to write their own Java code (a custom SerDe) to handle their data format requirements.

7. Why do we need Hive?

Hive is a tool in the Hadoop ecosystem that provides an interface to organize and query data in a database-like fashion using SQL-like queries. It is suitable for accessing and analyzing data in Hadoop using SQL syntax.

8. What is the default location where hive stores table data?

hdfs://namenode_server/user/hive/warehouse

9. What are the three different modes in which hive can be run?

  • Local mode
  • Pseudo-distributed mode
  • Distributed mode

10. Is there a date data type in Hive?

Yes. Hive has a DATE data type (since Hive 0.12), and the TIMESTAMP data type stores date and time values in java.sql.Timestamp format.

11. What are collection data types in Hive?

There are three collection data types in Hive.

  • ARRAY
  • MAP
  • STRUCT
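A short sketch of a table using all three collection types (the table and column names are invented for this example):

```sql
CREATE TABLE employees (
  name    STRING,
  skills  ARRAY<STRING>,                   -- e.g. ['hive', 'hadoop']
  phone   MAP<STRING, STRING>,             -- e.g. {'home': '555-1234'}
  address STRUCT<city:STRING, zip:STRING>
);

-- Array elements and map values use [], struct fields use dot notation:
SELECT skills[0], phone['home'], address.city FROM employees;
```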

12. Can we run unix shell commands from hive? Give example.

Yes, by prefixing the command with an exclamation mark (!).

For example, !pwd at the hive prompt will print the current directory.

13. What is a Hive variable? What is it used for?

A Hive variable is a variable created in the Hive environment that can be referenced by Hive scripts. It is used to pass values to Hive queries when the query starts executing.
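A minimal sketch of setting and using a Hive variable (the variable name, table, and query are hypothetical):

```sql
-- Set a variable in the Hive session (or pass it with: hive --hivevar cutoff=2024)
SET hivevar:cutoff=2024;

-- Reference it in a query via ${...} substitution
SELECT * FROM sales WHERE year >= ${hivevar:cutoff};
```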

14. Can hive queries be executed from script files? How?

Yes, using the source command.

Example:

hive> source /path/to/file/file_with_query.hql

15. What is the importance of .hiverc file?

It is a file containing a list of commands that need to run when the Hive CLI starts, for example, setting strict mode to true.
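A sketch of what a .hiverc file might contain (these particular settings are just examples, not required contents):

```sql
-- Sample ~/.hiverc: each line runs automatically when the Hive CLI starts
SET hive.mapred.mode=strict;
SET hive.cli.print.header=true;
```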

16. What are the default record and field delimiters used by Hive?

The default record delimiter is \n.

The default field delimiters are \001, \002, and \003 (for fields, collection items, and map keys, respectively).
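These defaults can be overridden when creating a table; a sketch (the table name and delimiters are invented for illustration):

```sql
CREATE TABLE logs (ts STRING, msg STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','    -- overrides the default \001
  LINES TERMINATED BY '\n';   -- the default record delimiter
```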

17. What do you mean by schema on read?

The schema is validated against the data when the data is read, not enforced when the data is written.

18. How do you list all databases whose name starts with p?

SHOW DATABASES LIKE 'p*';

19. What does the “USE” command in hive do?

With the USE command, you fix the database against which all subsequent Hive queries will run.

20. How can you delete the DBPROPERTY in Hive?

There is no way you can delete the DBPROPERTY.

21. What is the significance of the line

set hive.mapred.mode = strict;

It sets the MapReduce jobs to strict mode, in which queries on partitioned tables cannot run without a WHERE clause that filters on the partition column. This prevents very large jobs from running for a long time.
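The behavior under strict mode can be sketched as follows (the table and partition column are hypothetical):

```sql
SET hive.mapred.mode=strict;

-- Rejected in strict mode: no partition filter, would scan every partition
-- SELECT * FROM sales;

-- Allowed: the WHERE clause restricts the scan to one partition
SELECT * FROM sales WHERE year = 2024;
```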

22. How do you check if a particular partition exists?

This can be done with following query

SHOW PARTITIONS table_name PARTITION(partitioned_column='partition_value');

23. Which java class handles the Input record encoding into files which store the tables in Hive?

org.apache.hadoop.mapred.TextInputFormat

24. Which java class handles the output record encoding into files which result from Hive queries?

org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

25. What is the significance of the 'IF EXISTS' clause while dropping a table?

When we issue the command DROP TABLE IF EXISTS table_name

Hive does not throw an error if the table being dropped does not exist in the first place.

26. When you point a partition of a hive table to a new directory, what happens to the data?

The data stays in the old location. It has to be moved manually.

27. Write a query to insert a new column (new_col INT) into a Hive table (htab) at a position before an existing column (x_col)

Hive's CHANGE COLUMN clause supports only FIRST and AFTER (not BEFORE), so the column is first added and then moved after the column that immediately precedes x_col (written here as the placeholder col_before_x), or moved FIRST if x_col is the first column:

ALTER TABLE htab ADD COLUMNS (new_col INT);

ALTER TABLE htab CHANGE COLUMN new_col new_col INT AFTER col_before_x;

28. Does the archiving of Hive tables give any space saving in HDFS?

No. It only reduces the number of files, which makes them easier for the NameNode to manage.

29. How can you stop a partition from being queried?

By using the ENABLE OFFLINE clause with the ALTER TABLE statement.

30. While loading data into a hive table using the LOAD DATA clause, how do you specify it is a hdfs file and not a local file ?

By omitting the LOCAL clause in the LOAD DATA statement.
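The contrast can be sketched as follows (the paths and table name are invented examples):

```sql
-- With LOCAL: loads from the local filesystem of the machine running the client
LOAD DATA LOCAL INPATH '/tmp/sales.csv' INTO TABLE sales;

-- Without LOCAL: the path is resolved against HDFS
LOAD DATA INPATH '/user/hive/staging/sales.csv' INTO TABLE sales;
```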

Career scopes and salary scale

There are many opportunities in this field if you have the right expertise and knowledge. With the technology space growing day by day, the number of openings keeps increasing, and most companies are looking to hire people who know how to work with Hadoop and Hive. As a fresher, if you get an opportunity to join a firm, you can initially be paid around 6,000 to 12,000 dollars per annum, while experienced professionals can earn up to 50,000 dollars per annum.

Conclusion

If you want to become an expert in this field, it is important to learn the right procedures and gather a sufficient amount of knowledge. You can do exactly that by reading our Hive interview questions and answers.