Review: PHP and MongoDB Web Development Beginner’s Guide

January 24th, 2012 No comments

PHP and MongoDB Web Development Beginner’s Guide by Rubayeet Islam
My rating: 4 of 5 stars

This was a great book for those looking to get their feet wet with MongoDB. It covered more topics than many of the MongoDB books I have read recently and, while I am not a PHP developer, it gave me a few more ways to leverage MongoDB in my day-to-day work.

I especially enjoyed the chapters on geospatial functionality and the GridFS system. Both of these topics were handled thoroughly and are typically glossed over in other books.

The one place I felt this book was light was the operations and administration side of things. Nowadays, many developers handle operations as well, and I feel those topics could be explored further in books like this.

All in all, I would recommend that anyone in web development looking for somewhere to start with MongoDB pick up this book and give it a read.


Categories: General Tags:

Infrastructure Monitoring – Scout

December 1st, 2011 1 comment

Scout Server Monitoring

In my last post on Application Deployment I talked about deploying your application straight to production and using metrics to measure it. One of the tools we use for server monitoring is Scout. Scout is a great service that not only monitors your typical system metrics but also has a great collection of community-built plugins and the capability to write your own plugins (using Ruby).

It has all of your standard features: graphing, triggers, email and SMS alerts, server groups, support for cloud instances, etc. But what we really like is the ability to build your own dashboard as well as custom graphs.

Scout ~ 3 Month Graph

We have charts that we built and use on our main monitoring board. These charts combine our Redis queues, processing throughput and database commands. The great part is that the data points come from different servers and services that, on their own, don’t really give enough insight into whether there is an actual issue or where that issue may be. But when they are all combined onto one chart, the context for spikes can be seen and appropriate action (if necessary) can be taken.

Mongo Queries and Inserts with Redis Queues
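
To make the idea of a combined chart concrete, here is a minimal Python sketch of lining up readings from separate sources by timestamp so they can be read side by side. It does not use Scout's API or plugin format; the series names and sample values are hypothetical.

```python
from collections import defaultdict


def combine(series_by_name):
    """Merge {name: [(timestamp, value), ...]} into rows sorted by timestamp.

    Each row is (timestamp, [value per series, in input order]); a series
    with no reading at a given timestamp contributes None.
    """
    rows = defaultdict(dict)
    for name, points in series_by_name.items():
        for ts, value in points:
            rows[ts][name] = value
    names = list(series_by_name)
    return [(ts, [rows[ts].get(n) for n in names]) for ts in sorted(rows)]


# Hypothetical readings collected from different servers (timestamp in seconds).
series = {
    "redis_queue_length": [(0, 40), (60, 400)],
    "mongo_inserts": [(0, 900), (60, 150), (120, 880)],
}

combined = combine(series)
for ts, values in combined:
    print(ts, values)
```

Seen together, a growing queue alongside falling inserts tells a different story than either number would on its own.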

Scout was the first monitoring service AuthorityLabs began using, back when it was a one-server web app with a handful of clients. It has served us well since then, growing with us, and they are continuing to add items and make the app more robust. We have since supplemented our monitoring with other services and internal tools, but Scout is still a go-to for us when it comes to building a custom plugin and getting those data points on a graph with our existing metrics.

Next time I will cover how we used New Relic to help us with performance monitoring and application bottlenecks.

Deployment, What Really Matters

November 22nd, 2011 No comments

At AuthorityLabs we have a saying: “We’ll Do It Live”. This refers to the fact that anyone in the company has the ability to deploy any part of our system to the production environment. There are no keys to the kingdom, as it were, and it is a rather large kingdom.

Many companies nowadays have this policy and rely on their test suites and continuous integration systems to catch bugs and failing tests, but that to me is the easy part. What we have found over the last year is that it is rarely a bad test or, worse, a missed test case that causes us problems. It is usually underperforming code or an infrastructure issue that pops up at scale. When thinking about these issues, we started looking at how we could handle this without taking away the ability for anyone to deploy updates as quickly as possible.

What we found was missing was metric-driven deployment. The concept isn’t new, but it is overshadowed by all the talk of CI and continuous deployment. Now when we deploy something, we have a dashboard that shows us all of our critical metrics (work throughput, server loads, queue lengths, etc.), and we watch these for variations that are out of line with the norm and with what we expected. It is amazing what you will see once you hit certain scale thresholds. This system lets us do a couple of things.

We can roll back the deployment and reevaluate the code, spin up additional resources to make up for the drop, or make changes to our infrastructure to deal with things like connection limits. This has had an interesting and welcome side effect. We now monitor and check many more metrics than before and have a better idea of the health of our system. It has also caused us to think about our code in a different way. We now put more thought into how it is going to affect the system as a whole and immediately add support for any new metric we think will need to be tracked for new features.
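
As a rough illustration of watching metrics for variations that are "out of line with the norm", here is a minimal Python sketch. It is not the dashboard described above: the metric names, sample readings and the three-standard-deviation threshold are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def out_of_line(baseline, current, threshold=3.0):
    """True if `current` is more than `threshold` standard deviations
    away from the mean of the `baseline` readings."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is out of line.
        return current != mu
    return abs(current - mu) / sigma > threshold


def check_deploy(baselines, post_deploy, threshold=3.0):
    """Return the sorted names of metrics whose post-deploy reading is out of line."""
    return sorted(
        name
        for name, value in post_deploy.items()
        if name in baselines and out_of_line(baselines[name], value, threshold)
    )


# Hypothetical pre-deploy readings for each critical metric.
baselines = {
    "work_throughput": [980, 1010, 995, 1005, 1000],
    "server_load": [1.9, 2.1, 2.0, 2.2, 1.8],
    "queue_length": [40, 45, 42, 38, 41],
}
# Hypothetical readings taken shortly after a deployment.
post_deploy = {"work_throughput": 600, "server_load": 2.0, "queue_length": 43}

flagged = check_deploy(baselines, post_deploy)
print(flagged)
```

A flagged metric is only a prompt to look closer; whether the right response is a rollback, more resources or an infrastructure change is still a human decision.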

Does this mean we have fewer issues? No (we are doing it live, remember), but we are able to deal with them faster and in a better way. We aren’t just rolling back after someone complains and then investigating. We are seeing the actual problems in real time and correcting them where they need to be corrected.

I will follow this up with a post on our tools (Scout, New Relic, etc.) and some of the internal things we have done to tie this together. In the meantime, sit back and enjoy Mr. O’Reilly doing it live:

YouTube Preview Image

NoSQL Realities

November 8th, 2011 No comments

My Twitter stream and usual haunts on the Internet have recently seen an increase in NoSQL bashing. The one common thread seems to be that “pick your NoSQL” solution is not as good as “pick your SQL” solution at “pick your topic”. I am not here to try to debunk these statements or prove one or the other wrong. I would just like us to compare apples to apples and have a real conversation about when and where to use the right solutions, regardless of the camps they fall in.

First, let’s be realistic: NoSQL is not going away and will be more and more a part of our lives every day. So before taking the fanboy comments on Twitter to heart, do yourself a favor: read up on the pros and cons of any solution you are going to use and run some tests on your laptop. Most of the time there is more than one solution that will work for your needs, and better understanding the focus and future direction of the technology can help you make that decision.

OK, now for the part of the conversation I think is missing:

  1. My NoSQL is more performant than your SQL!
    This statement is not only bold, but very vague. What do you mean by more performant? Are we talking about server resources, reads per second, writes per second, etc.? Come on, this is just going to start an argument where everyone is comparing metrics and benchmarks that are not relevant to each other. Also, you can configure most SQL systems to perform at the level of their NoSQL counterparts, but doing so will degrade their performance in other areas. Doing this may be beneficial for your team/company in not making them learn a new technology, but it also hampers you when leveraging some other feature of your SQL system that is no longer configured correctly.
  2. NoSQL is immature and not ready for production
    This will vary by solution. I would argue that the file system is more mature than any SQL solution (yes, it is a NoSQL solution), but I would also say that many of the new kids on the block should be tried and tested before moving them to production, and you should expect to have problems and find bugs that have already been worked out in the older, more stable SQL systems. This, however, is not a reason to dismiss the solution; it is a reason to spend more time reading up on it and talking to the few that are running it so you don’t make the same mistakes.
  3. NoSQL can’t do everything SQL can
    Of course it can’t; it isn’t meant to. Most NoSQL systems are built to target a very specific pain point, and they accomplish this by abandoning features and overhead that most SQL systems implement. This doesn’t mean go implement every NoSQL solution known to man to gain a few milliseconds in your system, but if you find a solution that can make a significant impact on the performance of your application or save you a tremendous amount of time, then it may be time to think about moving that functionality into a NoSQL solution.
  4. NoSQL is not secure
    This is true for a lot of NoSQL solutions. I am not sure why this has been handled this way, but there is good news. You can solve this with your operating system and/or firewall. This is a valid concern and you really need to be aware of how this affects you and your data when implementing any solution.

That is a short list of the statements being flung around, but I think you get the idea.

I don’t know of any NoSQL solution that claims to be a drop-in replacement for all things SQL. The performance gains many NoSQL solutions are able to claim come at the expense of not being able to do many of the things SQL can, which pushes these concerns out of the database system and back up to the developer. This can be both a blessing and a curse. Frameworks, ORMs and the like can mitigate it, but that is a whole other issue that could use some discussion and actually muddies the water even more.

Next time you want to bash or defend NoSQL, think about your reasons, the context and the real-world implementations, then take the conversation somewhere that allows you more than 140 characters.

Categories: NoSQL Tags:

Salesforce Acquisitions over the last 5 years

October 6th, 2011 No comments
Salesforce Acquires their way to world domination

How has Salesforce gone from a small web-based CRM to a market powerhouse in 5 years? By spending cash! And not just on development of new features, but by acquiring a stable of companies that complement their company’s vision and strategy. This infographic, created at AuthorityLabs, documents the spending spree of the last 5 years.

That is over $831M in acquisitions in over 5 years.
Categories: General Tags:

What Can We Learn From Embedded Development

September 4th, 2011 No comments

We recently kicked off Gangplank Labs over at Gangplank, and for the last month or so I have been working with the Arduino microcontroller, XBee wireless, an Epilog laser engraver/cutter and a slew of sensors. I do not have a background in electrical engineering and have never really played with the embedded side of development other than the Basic Stamp starter set when I was younger, so I dove head first into this, relying on my experience in software development. That got me off to a good start, but I hit roadblocks I had long ago forgotten about.

In this day and age of cheap processors, cheaper RAM and even cheaper storage, all on a virtual machine out in the ether of the internet, it is easy to lose sight of some of the problems experienced in the past and even easier to overcome those problems by throwing more “power” at them to improve performance. Well, that luxury doesn’t exist in the embedded world. Yes, things have gotten better and cheaper, but resource limits still have a larger impact on your design than they do when spinning up an instance on Amazon AWS. That was my first problem: the program has to fit on a limited and fixed storage device. I couldn’t just keep including libraries to make development easier. I had to get selective, creative and sometimes even roll my own. I found myself reading the source of the libraries in much more detail than, say, a gem I would include in a Rails app.

Then there was the actual design. Not just the case design, but the actual hardware and component design. Being new to the space, I am unfamiliar with all the different chips and design patterns available, and that showed. Once I maxed out my available inputs and outputs on the Arduino I had to once again get clever. Fortunately, I had a few people around I could tap on the shoulder and they pointed me down the right path. The problem here is that I have to order what I need and wait for it to ship to me (and I am very impatient). The first couple of times this was OK, but then every time I would iterate over the design and find a better solution I would have to place an order and wait. This caused me two unexpected pain points. It was starting to increase the cost unnecessarily and it was starting to delay the project. Being used to downloading and purchasing digital licenses had spoiled me. A little more design and planning up front could have saved me a couple of weeks and a few bucks.

Updates are also not as easy. This is a lot closer to desktop development, but even desktop developers still have it easier. If you want to auto-update, that means increasing your cost to include an Ethernet or Wi-Fi chip on your device, which could put it out of the affordable price range. This really means updates have to be performed like a good ole firmware update. Download the update, connect the device, click run and pray you don’t brick your device. QA is very important here. Test, test, test and test. Not unit tests; I mean hands-on field tests. Once you release the device to another party you have no guarantee you will ever get to patch a bug that is in it. Not to mention, what if it was hardware and not software causing the bug?

What did I learn? Slow down and think more thoroughly about how my design impacts the project. Be selective and include only what you need. There is not enough room for the kitchen sink. And test, test, test. Make sure everything is working as planned and is stable.

I would recommend that any developer out there grab their team and a few Arduinos. Come up with a couple of fun project ideas that take additional hardware, pair up and go implement them like you would in your software development world. Let the issues come up, deal with them and then think about how it differs from your development world. After your little retrospective, see if there is anything you could bring back to your development team that could help you improve your process.

Explore AZ: Payson

August 30th, 2011 No comments

Explore AZ: Payson, a set on Flickr.

We recently went to Payson with some other Gangplankers for a great camping trip to get away from the heat of Phoenix.

It was our first camping trip since moving to AZ and the first one that Zoe remembers. Fortunately she loved it and that means Zoe, Angelina and I now outnumber Shey when it comes to camping and we will be doing much more.

Categories: General Tags:

What Exactly Does Big Data Mean?

July 31st, 2011 No comments

Every time I get into a discussion with someone on the topic of “Big Data” it seems to diverge into one of a handful of subtopics: what size is considered big data, what technology you must be using to be considered big data, and what problem you are trying to solve with your data. These are all great buzzworthy topics, but does it matter what technology you are using or truly how big your data is?

When it comes to the size of your data, everyone will (and should) have a different definition of big. Really, what makes data big depends on the resources you have available to manage that data. With this in mind, Walmart or Facebook and their data make what I deal with look tiny. Does that mean I don’t experience similar challenges to them? No. While their task is far larger and greater in scope than mine, they also have deeper wallets, more personnel and greater technology resources. So just because you aren’t dealing in petabytes doesn’t mean you are not dealing with big data.

NoSQL and Hadoop are a requirement for being in the big data space, right? I mean, after all, that is why those tools were invented, and if you aren’t using them to solve your problems then you clearly haven’t entered the big data space. While those tools can be useful (or even detrimental), they are not required to be in the big data space. MySQL, Postgres and the others have been around forever and people have been using them to solve their data problems without the other tools. Hell, I know of one company that deals with what I would call Gigantic Data and does it all in flat files (although I guess those are the original NoSQL solution). It is not the tools you are using; it is how you are using them and what you are trying to accomplish.

Which brings us to what I think the real problem is: what are you trying to solve? Or better yet, what question are you trying to answer? If you don’t really know, then you are data warehousing and most likely archiving big data sets, but not really in what I would call the big data space. This is where I think things get muddy for most. To me, all the new tools at our disposal and the fact that storage is so damn inexpensive are causing us to archive everything without knowing why, on a gut instinct that it will be worth something someday or that it holds the answer to some unknown question. Both of those may be true, but the data isn’t the piece that is truly valuable; it is the unknown question that will bring value.

“Big Data” is asking your data set for an answer to a question and getting that answer as quickly as possible.

Categories: Data, General, NoSQL Tags:

OSCON Data 2011

July 22nd, 2011 No comments

I am going to be up in Portland for the OSCON Data tracks this upcoming week. Are you going?


Categories: General, Mongo, MongoDB, Mongoid, NoSQL Tags:

Using memcached: How to scale

May 9th, 2011 No comments

Using memcached: How to scale by Finsel
My rating: 4 of 5 stars

Good book for helping catch the mistakes you could make implementing memcached for the first time. This is definitely worth a read if you are thinking of implementing any name-value pair (NVP) cache system. The lessons conveyed in Using memcached can be applied not only to memcached, but also to Redis, Riak, etc.
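
For readers new to name-value pair caches, here is a minimal Python sketch of the read-through ("cache-aside") pattern that such systems are commonly used for. It is not taken from the book and does not talk to a real memcached server: `TTLCache`, `fetch` and `slow_lookup` are hypothetical names, and the in-memory dictionary is a stand-in for the cache.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a name-value pair cache with expiry."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, self._clock() + ttl)


def fetch(cache, key, load, ttl=60):
    """Try the cache first; on a miss, call load(key) and store the result.

    Note: a loaded value of None is indistinguishable from a miss here,
    so it is reloaded on every call.
    """
    value = cache.get(key)
    if value is None:
        value = load(key)
        cache.set(key, value, ttl)
    return value


calls = []


def slow_lookup(key):
    """Hypothetical expensive lookup (e.g. a database query)."""
    calls.append(key)
    return key.upper()


cache = TTLCache()
first = fetch(cache, "user:1", slow_lookup)
second = fetch(cache, "user:1", slow_lookup)  # served from the cache
print(first, second, len(calls))
```

The expiry time is the usual place for first-time mistakes: too long and readers see stale data, too short and the cache stops absorbing load.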


Categories: General Tags: