Version 2 of Parallel Data Warehouse (PDW) is apparently due this month (and can be ordered as of March 1st).  The official name is SQL Server 2012 Parallel Data Warehouse.  Here are some of the major new features:

  • Will have xVelocity columnstore index support (including the new columnstore index features of being updatable and able to be a clustered index), making queries 10-50x faster and providing 15x compression
  • Will use Windows Server 2012 Storage Spaces
  • Uses Hyper-V, with everything virtualized
  • Runs SQL Server 2012 and Windows Server 2012 Standard
  • Failover is now handled by Hyper-V, replacing HPC
  • Uses DAS (Direct-Attached Storage) via SAS JBOD, versus a previous strategy of simulating shared-nothing on a SAN (Storage Area Network)
  • Will include a new data-processing engine called PolyBase, which is designed to enable queries across relational data and non-relational Hadoop data in the Hadoop Distributed File System (HDFS).  You can create an external table in SQL Server (somewhat like a linked server) and query it with T-SQL.  So you can retrieve data from HDFS with a PDW query (seamlessly joining structured and semi-structured data), import data from HDFS to PDW, and export data from PDW to HDFS.  Microsoft Technical Fellow David DeWitt is one of the principals behind PolyBase.  PolyBase will be used in PDW for now, and later it will be added to SQL Server.  See Seamless insights on structured and unstructured data with SQL Server 2012 Parallel Data Warehouse
  • Will have an updated distributed query processor and a new admin console
  • Improved speed.  At PASS 2012 they demoed a 1PB data warehouse query finishing in under two seconds
  • Upgraded hardware.  For HP, the new appliance is called the Enterprise Data Warehouse, or EDW V2.  In addition to increased CPU processing power (16 cores per EDW V2 compute node vs. 12 cores per EDW V1 compute node), the EDW V2 contains 256GB of memory per server vs. 96GB for EDW V1.  The EDW V2 appliance also contains 35 disks per compute node vs. the 11 or 24 disk options on the EDW V1 appliance.  This will allow customers to grow the EDW V2 to support up to 5 PB (petabytes) of data.  Finally, EDW V2 is available in a quarter-rack system (with EDW V1, two full racks, control plus compute, was the smallest you could go).  Hardware will be ProLiant Gen8 DL360 with up to 8 compute nodes per rack and up to 7 racks
  • Dell hardware will be PowerEdge R620 with up to 9 compute nodes per rack and up to 6 racks
  • Allows customers to use their own hardware to perform backup operations (V1 required customers to purchase a backup node and its respective storage)
  • Direct query with Power View, PowerPivot, PerformancePoint
  • Use SSMS instead of Nexus
  • 2.5x lower price per terabyte and 50% lower total hardware list price
  • Up to 70% more storage capacity
  • Up to double the rate of data loading speed
  • Support 0TB – 5PB (v1 was 50TB – 600TB)
  • Workload management enhanced: 4 predefined resource classes as server roles; allocation of fixed amounts of memory and PDW concurrency slots; an administrator can associate principals with resource classes; works similarly to Resource Governor
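
To put the hardware figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in the list (nodes per rack, maximum racks, and the HP per-node cores and memory). The class and function names are made up for illustration, and the totals assume a fully populated appliance with every rack at its maximum compute node count, which may not match how real configurations are sold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApplianceLimits:
    """Scale-out limits for one vendor's V2 appliance (hypothetical helper)."""
    vendor: str
    nodes_per_rack: int
    max_racks: int

    @property
    def max_compute_nodes(self) -> int:
        # Assumes every rack is filled to its maximum compute node count.
        return self.nodes_per_rack * self.max_racks


# Figures quoted in the feature list above.
HP_EDW_V2 = ApplianceLimits("HP", nodes_per_rack=8, max_racks=7)
DELL_V2 = ApplianceLimits("Dell", nodes_per_rack=9, max_racks=6)

# Per-compute-node figures quoted for the HP EDW V1 and V2 appliances.
HP_V1_CORES, HP_V2_CORES = 12, 16
HP_V1_MEMORY_GB, HP_V2_MEMORY_GB = 96, 256


def hp_v2_totals() -> dict:
    """Total cores and memory across a maxed-out HP EDW V2 (an assumption)."""
    nodes = HP_EDW_V2.max_compute_nodes
    return {
        "compute_nodes": nodes,
        "cores": nodes * HP_V2_CORES,
        "memory_gb": nodes * HP_V2_MEMORY_GB,
    }


if __name__ == "__main__":
    for appliance in (HP_EDW_V2, DELL_V2):
        print(f"{appliance.vendor}: up to {appliance.max_compute_nodes} compute nodes")
    print(f"HP memory per node, V2 vs. V1: {HP_V2_MEMORY_GB / HP_V1_MEMORY_GB:.2f}x")
    print(f"HP cores per node, V2 vs. V1: {HP_V2_CORES / HP_V1_CORES:.2f}x")
    print(hp_v2_totals())
```

The Dell totals for cores and memory are left out on purpose, since the list above gives per-node figures only for the HP appliance.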

More info:

Appliance: Parallel Data Warehouse (PDW)

Video Parallel Data Warehouse Version 2

The EDW evolution continues and it is Bigger, Faster and Better !

SQL Server 2012 PDW: Game on!