Monday, August 31, 2015

Connecting IntelliJ to Redshift

Are you using Amazon's Redshift Service and you want to connect to it via IntelliJ?  Here's how you do it.

Download Redshift JDBC Driver from Amazon

  1. In your Redshift console, go to the "Connect Client" section and download the Redshift JDBC driver from Amazon.
  2. Copy the file from your Downloads folder to somewhere more permanent, like your home directory or a location accessible to all users on the computer.
  3. Set permissions to allow everyone to read the jar file:
    chmod a+r RedshiftJDBC41-1.1.7.1007.jar

Add Redshift Driver to IntelliJ

  1. In IntelliJ, go to View / Tool Windows / Database
  2. Open the Data Source Properties
  3. Press the + button and select "Database Driver" to create a new Redshift Driver
  4. In the "Driver Files" section, press the + button and select the path where you put the driver JAR file.
  5. If the driver can be read successfully, the Class Name drop down should show more than just "java.sql.Driver".  
    • In my case, it couldn't be read because OS X had put the file in quarantine mode, which you can see with the "ls -l@" command:
      ls -l@ *.jar
      -rw-r--r--@ 1 root  wheel  2198193 Aug 28 13:42 RedshiftJDBC41-1.1.7.1007.jar
       com.apple.metadata:kMDItemWhereFroms     211
       com.apple.quarantine      67

    • The quarantine attribute can be removed with this command:
      xattr -dr com.apple.quarantine RedshiftJDBC41-1.1.7.1007.jar

  6. Use the following settings:
    • Class Name: com.amazon.redshift.jdbc41.Driver
    • URL Template: jdbc:redshift://{host}:{port}/{database}
    • Default Port: 5439 (or whatever your cluster uses)
    • Dialect: PostgreSQL
    • Requires Authentication: Yes
    • Local Database: No
    • Auto Commit: Yes for convenience, No for safety (This Requires you to explicitly commit after executing DML)
    • Auto Sync: Yes (turn this off if it slows you down too much)
  7. Click Apply

Connect to your Redshift Instance with the new Driver


  1. In the Same Data Sources and Drivers Window in IntelliJ, press the + button again and select Redshift
  2. Go to your Redshift Console in AWS.  Copy the string under "Endpoint" into the "Host" field in IntelliJ.  this is in host:port format, so cut the port off, delete the ":" and paste the port into the port field.
  3. Enter your database name, user, and password, and test the connection.  
  4. Hit OK.  You can now query Redshift from your IDE!


Thursday, June 10, 2010

Bucket DateTimes in T-SQL

Today at work I was working on a SQL Server Reporting Services report which shows us the number of rows being created in a certain table in each 10-minute interval. While working on this, I tried several different ways in which I could "bucket" the datetime column into the same ten-minute period. At the end, I finally decided to extract this logic into a User-Defined Function which can bucket any datetime into any size interval.

Here's the function:




/* BucketDateTime:
* Buckets any datetime into an interval of the given size.
* Returns the minimum datetime for the interval chosen, or NULL if @MinutesPerBucket < 1
* Example: bucket times 7:29pm and 7:31pm to the nearest 30 minute interval
* SELECT dbo.BucketDateTime('2010-04-01 19:29:00', 30) => '2010-04-01 19:00:00.000'
* SELECT dbo.BucketDateTime('2010-04-01 19:31:00', 30) => '2010-04-01 19:30:00.000'
*/
CREATE FUNCTION [dbo].[BucketDateTime]
(
@SourceDateTime DATETIME, -- The DateTime to bucket
@MinutesPerBucket INT -- The number of minutes per bucket.
)
RETURNS DATETIME
AS
BEGIN

IF (@MinutesPerBucket < 1) RETURN NULL

-- The earliest possible datetime
DECLARE @Epoch DATETIME

-- # of minutes passed between the epoch and source datetime
DECLARE @MinutesSinceEpoch INT

-- The value to bucket @SourceDateTime into, based on @MinutesPerBucket
DECLARE @BucketDateTime DATETIME

-- # of minutes passed between @BucketDateTime
DECLARE @BucketMinutesSinceEpoch INT

-- This value is from Books online: http://msdn.microsoft.com/en-us/library/ms186724.aspx
SET @Epoch = CAST('1753-01-01' AS DATETIME)

SET @MinutesSinceEpoch = DATEDIFF(mi, @Epoch, @SourceDateTime)

SET @BucketMinutesSinceEpoch = @MinutesSinceEpoch - (@MinutesSinceEpoch % @MinutesPerBucket)

SET @BucketDateTime = DATEADD(mi, @BucketMinutesSinceEpoch, @Epoch)

RETURN @BucketDateTime

END

GO


Tuesday, June 8, 2010

Execute LINQ queries without locking any tables

At work I'm building an application which queries a SQL Server backend with LINQ to SQL, and I recently found that some of my queries were slow due to table locking. I found this excellent blog post referencing Scott Hanselman's blog post which describes how to execute LINQ queries without locking the source tables:

Deciding to go with option 1, I came up with an elegant solution for easily adding this to any LINQ queries. Let's say you are using a LINQ query like this:

var query = from f in Foo
select new { Foo = f };

foreach (var item in query)
{
//Do something
}

In this case, the query is actually executed in the foreach loop. By implementing an extension method, you can easily change this to use a transaction with the "Read Uncommitted" Isolation Level like so:

var query = from f in Foo
select new { Foo = f };

foreach (var item in query.ToListNoLock())
{
//Do something
}

The extension method is implemented like so:


public static class IEnumerableExtension
{
public static TransactionScope CreateNoLockTransaction()
{
var options = new TransactionOptions
{
IsolationLevel = IsolationLevel.ReadUncommitted
};
return new TransactionScope(TransactionScopeOption.Required, options);
}


public static List<T> ToListNoLock<T>(this IEnumerable<T> query)
{
using (TransactionScope ts = CreateNoLockTransaction())
{
return query.ToList();
}
}
}


And then of course you can implement new versions of various other extension methods on IEnumerable/IQueryable, such as SingleOrDefault, FirstOrDefault, etc.


Note that this is the same as adding (NOLOCK) hints to EVERY table in your query. I do not currently know of a way to add nolock hints to individual tables.