BigBox Product Features

Platform features

Data management and analytics functions
Project & components
BIGBOX Data Platform
Distributed batch processing of large data sets
Apache Hadoop
Database for unstructured & structured data storage of large tables
Apache HBase +conn, +indx
Reliably store very large files across machines in a large cluster.
Data warehouse summarization & ad hoc querying
Apache Hive
Metadata store for Hive tables
Hive Metastore (HMS)
Workflow scheduler to manage Hadoop jobs
Apache Oozie
Columnar storage format for Hadoop ecosystem
Apache Parquet
Fast compute engine for ETL, ML, DL, stream processing
Apache Spark
Bulk data transfer between Hadoop and structured datastores (e.g., MySQL, Microsoft SQL Server, Oracle, PostgreSQL, MongoDB, and others)
Apache Sqoop
Job scheduling and cluster resource management
Coordination service for distributed applications
Apache Zookeeper
Store and manage large data sets across a cluster
Apache Accumulo
Metadata management, data lineage, governance & data catalog
Apache Atlas
Migrate and replicate data to other ecosystems
Apache Ranger
Smallest, fastest columnar storage for Hadoop
Apache ORC
Data-flow framework for batch and interactive use cases
Apache Tez
Fast analytical queries on event-driven data
Apache Druid
Perimeter security governing access to Hadoop
Apache Knox
Cryptographic key management
Ranger KMS
Notebook for interactive analytics
Apache Zeppelin
Distributed processing for stateful computations
Apache Flink
Data serialization system
Apache Avro
Provisioning, managing, and monitoring Apache Hadoop clusters
SQL workbench for data warehouses
Distributed MPP SQL query engine for Hadoop
Column-oriented data store for fast data analytics
Apache Kudu
Enterprise search & index platform
Apache Solr
Real-time streaming data pipelines and apps
Apache Kafka
Distributed object store for Hadoop
Apache Ozone
Scalable directed graphs of data routing, transformation, and system mediation logic
Interactive tools for collaborating on and sharing analytics
Manage resources and data security across the Hadoop ecosystem
Apache Ranger
Integrate with various formats, data sources, protocols, and other environments
Strong authentication for client/server applications
Keeping track of the running applications
Distributed realtime computation system
Apache Storm
Data scraping and crawling engine
Faceted search engine
The Enterprise Business Intelligence for All Your Needs
Powerful Business Dashboard Software for Everyone
API Management platform
Build model evaluations (e.g., confusion matrix, cross-validation, AUC)
Apache Spark
Run the Change Data Capture (CDC) method
Data migration and replication between Data Centers
Apache Hadoop
Interoperable with other big data ecosystems
Capability to scale out or apply patches without service downtime or impact on the Hadoop ecosystem
Capability to develop modules or jobs, and to resume jobs when a failure occurs
Provides a data security module including authentication, access authorization, audit trail, data masking, encryption and others
Ranger, Knox, Kerberos
Data security management not only on data at rest, but also on data in motion
Ranger, Knox, Kerberos
Object storage capabilities with support for the S3 protocol
Apache Ozone
Build a comprehensive data lineage from the start of data collection to the aggregation stage
Apache Atlas
Integration with LDAP or Active Directory authentication services
Resource allocation among users based on CPU, storage, and memory requirements
Easy-to-use ETL tools with a GUI workflow designer
Ability to keep running when one of the nodes is not functioning (Fault Tolerance)
Scale-out capabilities to serve increasing data processing needs
ETL supports parallel processing on big data frameworks, including MapReduce, Spark, Storm, Flink, Tez, and others
Supports object reusability in ETL development
Unlimited ETL users
Big Action
Supports data analytics and modelling with machine learning and deep learning in Python, R, and other languages
Spark, Jupyter, Zeppelin
Analytics platform that can be integrated with big data clusters in the context of authentication, authorization, and resource management
Spark, Ranger
Sharing, publishing, and collaborating on data analytics projects
Spark, Jupyter, Zeppelin
Supports parallel processing of big data frameworks including MapReduce, Spark, Jupyter, Zeppelin
MapReduce, Jupyter, Zeppelin, Spark
Data analysis, model training, model deployment as APIs, and collaboration facilities
Spark, Jupyter, Zeppelin
Capability to monitor the machine learning models that have been deployed
Spark, Jupyter, Zeppelin
Unlimited analytics platform users
Spark, Zeppelin, Jupyter
The analytics platform can interact with Hive, HDFS, and Solr data sources
Hive, HDFS, Solr, Spark, Jupyter, Zeppelin
The analytics platform can integrate with Oozie, YARN and HDFS Browser
Oozie, YARN, HDFS Browser, Spark, Jupyter, Zeppelin

Copyright © 2021 BigBox. All rights reserved. Various trademarks held by their respective owners.