Datameer Q&A
What do Read, Write, and Execute Permissions mean in Datameer?
Datameer permissions are built like Linux-based or Mac-based permissions. This means you can grant different permissions on artifacts for different groups and different users within Datameer.
READ - Allows other users to view your artifact.
WRITE - Allows other users to edit your artifact. This requires read permissions as well.
EXECUTE - Allows other users to run your artifact (e.g. workbooks, import jobs). This requires read permissions as well.
For example, say you create a workbook. If you grant other users only READ access, they can see your analysis but cannot make changes. If you grant them WRITE permissions, they can go in and make changes to the workbook. If you grant them EXECUTE access, they can run the workbook against the entire data set.
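The rule that WRITE and EXECUTE each require READ can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not Datameer's implementation; the `Permission`, `grant`, and `can` names are hypothetical.

```python
from enum import Flag


class Permission(Flag):
    """Hypothetical permission bits, mirroring READ / WRITE / EXECUTE."""
    READ = 1
    WRITE = 2
    EXECUTE = 4


def grant(*requested: Permission) -> Permission:
    """Combine the requested permissions.

    READ is added automatically whenever WRITE or EXECUTE is granted,
    because both require read access.
    """
    granted = Permission(0)
    for permission in requested:
        granted |= permission
    if granted & (Permission.WRITE | Permission.EXECUTE):
        granted |= Permission.READ
    return granted


def can(granted: Permission, needed: Permission) -> bool:
    """Return True if every needed permission is present in granted."""
    return (granted & needed) == needed


viewer = grant(Permission.READ)      # may view, but not edit or run
editor = grant(Permission.WRITE)     # READ is implied
runner = grant(Permission.EXECUTE)   # READ is implied
```

Here `can(viewer, Permission.WRITE)` is false, while `can(editor, Permission.READ)` is true because granting WRITE pulled in READ.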
Term | Definition |
Job | The complete set of data including the connections, associated analytics (workbooks), and visualization tools (infographics). The job also includes the schedule of when the data gets updated, and whether this happens automatically or manually. |
Connections | The repository of structured, semi-structured, and unstructured data from one or more sources used to create analytics. |
Workbook | Where you view a sample of your data and create analytics, using the built-in functions, sorting, filtering, and other tools to discover relationships in your data set. |
Widgets | The reporting tools you use to easily create tables, charts, graphs, and other visual ways of looking at your data. Widgets let you quickly and visually manipulate your data. |
Infographics | Where you can see at a glance the tables, charts, and graphs you create for visualizing your data. |
You can use Datameer with any type of data, such as log files, call detail records, sales or transactional data, clickstream data, website metrics, social networking data, and more. You can combine multiple data sources and data types to collect the raw data you need for analysis. You can import data yourself or use the data imported by a system analyst.
Data formats supported include:
- Flat files such as Excel spreadsheets, comma-delimited text files (.CSV), FDFS (File Descriptor File System), Apache log files, and files on S3 (Amazon Simple Storage Service), as well as unstructured data such as Twitter data.
- Relational databases such as Oracle (10g), HSQL-DB, DB2, or MySQL (5.1)
- Other types such as Hive (a data warehouse infrastructure built on Hadoop)
Raw data is stored and processed using Hadoop, which manages and distributes both the data and the computational load over multiple computers networked together. The Datameer tools allow you to easily analyze and visualize relationships in the data.
Datameer includes data source integration, storage, an analytics engine, and visualization tools.
Datameer is based on Hadoop, which allows it to scale to accommodate and manipulate large volumes of data. It supports integrating data from many of the commonly used databases, including Oracle, DB2, and MySQL, and from files such as log files, Twitter data feeds, CSV files, Excel files, or text files.
Use Datameer to analyze customer relationship management content, web logs, customer data, sales data, social media content, and even data from Excel files. You can store that data on your own servers or use a service available on the cloud such as Amazon Web Services.
Datameer provides a familiar, interactive spreadsheet-based interface that is easy to use but also powerful, so you don't need to turn to developers for analytics. The spreadsheet is specifically designed for visualization of big data and includes more than 200 built-in functions for exploring and discovering complex relationships. In addition, because Datameer is extensible, you can use functions from third-party tools or write your own commands.
Datameer's Business Infographics tools include charts, graphs, and maps, and allow you to incorporate your own visual elements to produce stunning, print-quality data visualizations.
The Datameer tools allow you to easily Extract, Transform, and Load (ETL) data from multiple sources, including your current transactional database systems, regardless of source or format. Then you can analyze relationships in the data using an interactive spreadsheet interface and visualize the results of that analysis using the built-in infographic widgets.
Datameer is specifically designed to solve the challenges of accessing, analyzing, and using massive amounts of data, leveraging Apache Hadoop open source technology. Datameer enables enterprises to gain insights from all available data sources, regardless of size, in a cost-effective manner.
Massively parallel processing architecture facilitates ultra-fast performance of complex analytics. Hadoop scales to 4000 servers and petabytes of data, and the application processes are fully parallelized inside Hadoop clusters. This dynamic workload optimization utilizes hardware more efficiently.
Datameer includes built-in fault resilience for high application availability, and elastic expansion to dynamically expand storage capacity without system downtime. The advanced data compression increases performance and decreases storage requirements.
What is a job?
A job sets up the connection to a data source to import information into Datameer for processing. It can then run at the intervals you specify: for example, when manually triggered, when data changes, or on a time schedule you set up. That way, you control how current the data is and how frequently it gets updated.
An analytics job runs the calculations and logic you set up in the workbook and displays the results in the infographic widgets so you can easily view or share your results.
What are Connections?
Each type of data is set up as a connection so it can be used by Datameer. For example, you can have sales data from an Oracle or MySQL server, other content from a CSV file exported from Excel, Twitter feeds about your company and products, and customer call logging data from yet another source. You can easily pull all that information into Datameer.
How do I connect to various kinds of data?
You create a workbook in Datameer that connects to one or more of these sources of data, which you can then use to do analysis. For example, you could use sales data from your corporate database, Twitter feeds, and customer call logging data, all from different sources, as the basis for your analysis.
Key concepts for Analysts
Before you can use Datameer to do analysis, you or a systems administrator need to set up a connection and import data into Datameer. Once that is done, you can choose a job to start analyzing.
When you use a workbook:
·You are setting up an analysis while viewing a subset of the entire data set. That analysis will then run on the full data set
·Each filter you apply to the data creates a new tab (sheet) in the workbook, and is one step of the analysis
·You can click each tab of the workbook to see the results of that step of analysis
·Some pages are read-only and others are editable (you can easily tell by viewing the tabs)
·Calculations apply to columns, not to a range of cells in a column
·There are more than 175 built-in formulas
·You can add custom formulas using the Application Programming Interface (API)
·You can choose which sheets you want to save along the way
·You can change workbook settings at any time
·You can import a worksheet into another workbook
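Two of the points above (designing on a sample before running on the full data, and calculations applying to whole columns) can be illustrated with a small Python sketch. It is a conceptual model only and does not use any Datameer API; the function names, the `AgeGroup` column, and the generated data are hypothetical.

```python
def apply_column_formula(rows, new_column, formula):
    """Add a computed column by applying the formula to every row.

    The formula is defined once for the whole column, not for a
    hand-picked range of cells.
    """
    return [{**row, new_column: formula(row)} for row in rows]


def age_group(row):
    """Hypothetical column formula used for illustration."""
    return "40+" if row["Age"] >= 40 else "under 40"


# Made-up data set standing in for the imported data.
full_data = [{"Name": f"user{i}", "Age": 18 + i % 50} for i in range(1000)]

# Design the analysis while looking at a small sample...
sample = full_data[:10]
preview = apply_column_formula(sample, "AgeGroup", age_group)

# ...then run exactly the same column formula on the full data set.
results = apply_column_formula(full_data, "AgeGroup", age_group)
```

Because the formula is attached to the column, the preview rows are identical to the first rows of the full result.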
When you use infographics:
·You are selecting and customizing a way of looking at the analysis you did in the workbook
·You can set up multiple widgets to look at your data in different ways
·The data fields you can show in the widgets come from the sheets that are saved in the workbook
·You can share your infographics with other users by sending a link to them
Key concepts for System Administrators
·You set up connections, which are collections of data that can be structured, semi-structured, unstructured, or a mix of types
·Jobs are created using connections, and include any associated workbooks and infographics created by analysts
·Both you and the analyst can specify when jobs will run
·You can optimize for speed by saving only the sheets you need in workbooks
·Datameer provides role-based security features. You can set up group permissions and assign users to groups.
Step 1
As your first step, open the first tutorial folder, Tutorial Hello World, in the Start Here folder in the Browser tab, or download the app from the Analytics App Market.
Step 2
Double-click the "My Upload" file and choose the "Edit" button.
Step 3
Here you would choose the file you wish to upload by selecting "Choose File" and choose its format from the drop-down menu. For this example, simply select "Next" to continue.
Step 4
In 'Data Details', leave the box "Column names are contained in the first row" checked. Also notice that you can choose your delimiter and advanced options (e.g. quote character). Click "Next" to continue.
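To see what the header-row, delimiter, and quote-character options control, here is a minimal sketch using Python's standard `csv` module. It only illustrates the concepts and is not how Datameer parses uploads; the `read_delimited` helper and the sample text are hypothetical.

```python
import csv
import io


def read_delimited(text, delimiter=",", quotechar='"', has_header=True):
    """Parse delimited text and return (column_names, data_rows).

    When has_header is False, generic column names are generated instead
    of taking them from the first row.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    rows = list(reader)
    if not rows:
        return [], []
    if has_header:
        return rows[0], rows[1:]
    names = [f"column{i + 1}" for i in range(len(rows[0]))]
    return names, rows


# Made-up sample: semicolon-delimited, with one quoted value that
# itself contains the delimiter.
sample_text = 'Name;Age;City\n"Smith; Ann";34;Chicago\nLee;29;Boston\n'

columns, data = read_delimited(sample_text, delimiter=";")
```

The quote character is what keeps `Smith; Ann` together as a single value even though it contains the delimiter.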
Step 5
You'll now see a preview of the data where you can make any changes as needed to the file type, column name, or columns to include. Click "Next" twice to continue to the "Save" Page.
Step 6
Save the File Upload with the same name by clicking "Save" or create a copy by clicking "Save Copy As..."
Step 7
Back in the Browser tab, click on the + button and select "Analytics" and then "Workbook." From the Link Data browser box, select "MyUpload" from the /_Start Here_/_1_Tutorial Hello World folder and click the blue Link Data button at the bottom of the box.
Step 8
Once you have opened the data in your workbook, you can analyze it. Click the Apply Filter button on the toolbar, choose "City" as the filter column and "Equals" as the expression, and type "Chicago" as your value. Then click Create Filtered Sheet at the bottom of the box.
Step 9
To save the workbook, click the "Save Workbook" button in the toolbar, choose a name for the file such as "MyWorkbook", and click Save. You will have options for how and when to run the workbook. Check the box "Start calculation process immediately after save", then click Next and then Save to finish and start the calculation.
Step 10
After the workbook has been calculated, you can visualize the analysis of your data through Datameer's Business Infographics™. Click the + in the Browser again and choose Infographic to create your first data visualization.
Step 11
From the "Add Widget" Inspector on the left of your screen, simply drag and drop the BARCHART onto the canvas. Click on "Data" on the widget to add data from your workbook. On the right side of your screen, navigate in the data browser to 'MyWorkbook', then to the latest results, then to Sheet1, and drag column 'Name' to the Label field and column 'Age' to the Data field. The widget will update automatically. That's it! If you want to save this infographic, simply click the Save icon, and you're all set. Have fun exploring!
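Conceptually, the filter in this step keeps only the rows whose "City" column equals "Chicago" and places them on a new sheet, leaving the original sheet untouched. A rough Python sketch of that idea follows; it is not Datameer code, and the sheet names and sample rows are made up for illustration.

```python
def create_filtered_sheet(sheet, column, value):
    """Return a new sheet containing only rows where column equals value."""
    return [row for row in sheet if row.get(column) == value]


# Hypothetical workbook with one source sheet.
workbook = {
    "Sheet1": [
        {"Name": "Ann", "Age": 34, "City": "Chicago"},
        {"Name": "Lee", "Age": 29, "City": "Boston"},
        {"Name": "Raj", "Age": 41, "City": "Chicago"},
    ]
}

# The filter result is stored as a new sheet; Sheet1 is not modified.
workbook["Filtered"] = create_filtered_sheet(workbook["Sheet1"], "City", "Chicago")
```

After this runs, the workbook holds two sheets: the original three rows and a filtered sheet with the two Chicago rows.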
To create a connection for a database type that you do not see listed in the file type list, you need to first add the appropriate database drivers to your Datameer installation.
How to Install Database Drivers
·Go to the driver download site: http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html
·Select the following driver:
ojdbc6.jar (2,111,220 bytes) - Classes for use with JDK 1.6. It contains the JDBC driver classes except classes for NLS support in Oracle Object and Collection types.
(or select a driver appropriate for your database.)
·In DAS, go to the Administration tab and then to the Database Drivers tab.
Click on the New button to add a new database driver.
·Enter the following information:
·Name: Oracle
·Jar File: ojdbc6.jar (or other file you downloaded)
·Dialect: Oracle-Dialect
·DriverClass: oracle.jdbc.driver.OracleDriver
·ConnectionPattern: jdbc:oracle:thin:@%host%:%port%:%sidName%
·Oracle should now appear as an available database driver.
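The connection pattern above contains `%host%`, `%port%`, and `%sidName%` placeholders that are filled in when a connection is created. The sketch below shows one way such a pattern could be expanded into a JDBC URL. It assumes simple text substitution, which may differ from what Datameer does internally; the `fill_pattern` helper, host name, and SID are hypothetical (1521 is the default Oracle listener port).

```python
import re

# Connection pattern from the driver settings above.
PATTERN = "jdbc:oracle:thin:@%host%:%port%:%sidName%"


def fill_pattern(pattern, **values):
    """Replace each %name% placeholder with the matching keyword value.

    Raises KeyError if the pattern contains a placeholder for which no
    value was supplied.
    """
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder %{key}%")
        return str(values[key])

    return re.sub(r"%(\w+)%", substitute, pattern)


# Hypothetical connection details for illustration.
url = fill_pattern(PATTERN, host="db.example.com", port=1521, sidName="ORCL")
print(url)  # jdbc:oracle:thin:@db.example.com:1521:ORCL
```

Failing loudly on a missing placeholder avoids producing a half-filled URL that would only be rejected later by the driver.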